Is serving static files through `static_files` and `static_dir` affected by scaling?

If part of my app.yaml file looks like this:
handlers:
- url: /favicon\.ico
  static_files: favicon.ico
  upload: favicon\.ico
- url: /static
  static_dir: public
- url: /.*
  secure: always
  redirect_http_response_code: 301
  script: auto
automatic_scaling:
  min_idle_instances: automatic
  max_idle_instances: automatic
  min_pending_latency: automatic
  max_pending_latency: automatic
  max_concurrent_requests: 1
  min_instances: 1
  max_instances: 10
Then is my static content also affected by the scaling parameters for the app? For example, would it run under the same max_concurrent_requests restriction per instance, or not?
My assumption is that serving /static happens in a completely different layer, independent from the instances running your app in GAE. I was trying to find an architecture diagram confirming this kind of decoupling (maybe a diagram with nginx acting as a load balancer in front of the GAE application instance nodes).
Ideally, a clear answer would be qualified with a reference to Google Cloud documentation material.
Closest related doc I found was this, but it does not clearly answer my question:
Storing and Serving Static Files

Your understanding of the static file serving architecture is correct. App Engine handles static file requests directly, without letting them reach the language runtime.
Because of that, these requests are not affected by the scaling settings the way "regular" requests are; max_concurrent_requests is a good example of that.
I have requested an update to the documentation page you referenced to add this information there.

Related

Google App Engine -- Deploying a new version will make my site go down

I have a flask + react application that is deployed on Google App Engine. Recently, I discovered that each time I deployed a new version to GAE, my site would go down for a few hours and several web pages could not load correctly. I checked the console: the web application was trying to fetch the static files from the previous version, which resulted in a 404 error. Can anyone help me find what the problem is?
Here is my app.yaml file:
runtime: python37
env: standard
default_expiration: "5m"
entrypoint: gunicorn -b :$PORT main:app --timeout 150
instance_class: F4
automatic_scaling:
  max_instances: 5
  min_instances: 1
  min_pending_latency: "5s"
  target_cpu_utilization: 0.75
inbound_services:
- warmup
handlers:
- url: /static/js/(.*)
  static_files: build/static/js/\1
  upload: build/static/js/(.*)
- url: /static/css/(.*)
  static_files: build/static/css/\1
  upload: build/static/css/(.*)
- url: /static/media/(.*)
  static_files: build/static/media/\1
  upload: build/static/media/(.*)
- url: /(.*\.(json|ico))$
  static_files: build/\1
  upload: build/.*\.(json|ico)$
- url: /
  static_files: build/index.html
  upload: build/index.html
- url: /.*
  script: auto
I am here to answer my own question. I seem to have found the problem and how to solve it.
The main problem seems to be a caching issue. Although the default expiration time in app.yaml is set to 5m, the pages generated by the backend don't have an expiration set. For example, the page www.example.com/about will have a different caching time than the JS bundles. This means that when a new build folder is deployed, the JS bundles have changed, but the www.example.com/about page generated by your backend application is still the old version, and it will try to request the JS bundles from the previous build folder, causing the 404 error.
The way to solve this is to set cache headers on the responses generated by your backend application. I am using the Flask environment, so the code for that is (credited to this answer):
response.headers["Cache-Control"] = "no-cache, no-store, must-revalidate"  # HTTP 1.1
response.headers["Pragma"] = "no-cache"  # HTTP 1.0
response.headers["Expires"] = "0"  # Proxies
the web application is trying to get the static files from the last version
So are these files ones that were removed in the new version you just deployed?
In general, it sounds like your problem has to do with stale browser caches. I wouldn't remove static assets from your deployed app right away specifically for this reason.
I see you're using ReactJS. Are you using the features that combine all the js and css into a single file whose name contains a hash? This should help with cache-busting.
The part that's confusing is that you said it would go down for a few hours. You have default_expiration: "5m" in your app.yaml so a few hours sounds a bit extreme. Are you not doing a hard reload (or even a full reload) when you are trying to check out your changes in your browser?

Google App Engine - keep previous version's static files

We've deployed a Vue SPA to Google App Engine and it's served completely by the static handlers.
The issue that we are facing is that if a user is active on our site mid-deploy, then their old Webpack chunk manifest becomes invalid (since some chunks' hashes are overwritten). If they now try to route to a new page and that page tries to fetch a chunk that got overwritten, we get the following error:
ChunkLoadError: Loading chunk Conversations failed.
(error: https://example.com/js/Conversations.71762189.js)
Ideally, we'd like to keep N (2-3?) previous versions of the app's static files.
Is our only option to push all the assets to a Cloud Storage Bucket? If so, how would we go about pruning older versions?
Here is my app.yaml for reference:
runtime: nodejs10
instance_class: F4
automatic_scaling:
  min_instances: 2
  max_instances: 10
default_expiration: "30d"
error_handlers:
- file: default_error.html
handlers:
- url: /api/*
  secure: always
  redirect_http_response_code: 301
  script: auto
- url: /js/*
  secure: always
  redirect_http_response_code: 301
  static_dir: dist/js
- url: /css/*
  secure: always
  redirect_http_response_code: 301
  static_dir: dist/css
- url: /img/*
  secure: always
  redirect_http_response_code: 301
  static_dir: dist/img
- url: /(.*\.(json|js|txt))$
  secure: always
  redirect_http_response_code: 301
  static_files: dist/\1
  upload: dist/.*\.(json|js|txt)$
  expiration: "10m"
- url: /.*
  secure: always
  redirect_http_response_code: 301
  static_files: dist/index.html
  upload: dist/index.html
  expiration: "2m"
The issue typically happens when the deployment overwrites an existing service version which receives traffic (i.e. the service version is not changed). From Deploying an app:
Note: If you deploy a version that specifies the same version ID as a version that already exists on App Engine, the files that you deploy will overwrite the existing version. This can be problematic if the version is serving traffic because traffic to your application might be disrupted. You can avoid disrupting traffic if you deploy your new version with a different version ID and then move traffic to that version.
As long as a service version is deployed and not deleted or overwritten, its respective static assets remain accessible.
To prevent the issue, always deploy using a fresh service version, then (gradually) migrate traffic to the newly deployed version. Keeping the latest N service versions around will give you the N sets of static assets you desire.
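If you go this route, the gcloud CLI can list and prune the older versions once you exceed N (a sketch; the service name and version ID are illustrative):

# List the deployed versions of the default service.
gcloud app versions list --service=default

# Delete an old version once its static assets no longer need to be served.
gcloud app versions delete v1 --service=default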
In general, this deployment practice is good/recommended for a few other reasons:
avoids potential outages, see Continuous integration/deployment/delivery on Google App Engine, too risky?
avoids potential traffic loss while GAE spins up enough new version instances to handle the traffic load, see 2nd half of GAE shutdown or restart all the active instances of a service/app
Potentially of interest: Google Frontend Retention between deployments
Deploy using the --no-promote flag, and utilize the Traffic Migration feature in the Standard Environment to gradually migrate traffic over to the new version so that all users don't experience an immediate switchover the moment the new version goes live. App Engine will host both the old and new versions (or, "blue" and "green") for a period of time until all traffic points to the new version, and then the old version will be shut down.
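A minimal sketch of that flow with the gcloud CLI (the version ID here is illustrative):

# Deploy the new version without routing any traffic to it.
gcloud app deploy --version=v2 --no-promote

# Gradually migrate traffic from the live version to v2.
gcloud app services set-traffic default --splits=v2=1 --migrate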
See also:
Testing on App Engine
Migrating and Splitting Traffic with the Admin API
Blue-Green Deployment Pattern

Google App Engine edge caching via cache-control?

Does App Engine cache responses server-side for either dynamic or static requests if I set Cache-Control headers? The documentation doesn't seem to clarify this either way: https://cloud.google.com/appengine/docs/standard/php/how-requests-are-handled
I have an API that returns highly cacheable responses, so it would be nice to leverage any edge caching.
You can set the cache expiration for static files in your app.yaml file:
- url: /static
  static_dir: static
  expiration: 10m
You can also set a default cache expiration in your app.yaml file:
application: my-app
version: 1
runtime: python27
api_version: 1
threadsafe: yes
module: default
default_expiration: "1h"
instance_class: F2
For caching JSON/response data from request handlers, you can use Memcache:
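For instance, a handler on the python27 runtime above could wrap its expensive work with Memcache along these lines (a sketch; the key name and build_report helper are hypothetical):

from google.appengine.api import memcache

def get_report():
    # Serve the cached payload if Memcache still holds it.
    data = memcache.get('api-report')
    if data is None:
        data = build_report()  # hypothetical expensive handler logic
        # Cache the result for up to one hour.
        memcache.add('api-report', data, time=3600)
    return data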

Restricting access to logged in users for static files in google app engine app by way of yaml rules fails

I tried to use the documented way of restricting access to URLs marked as static, by way of login: required rules in the app.yaml file. My intention is to have script URLs handled by the Go runtime respond to XMLHttpRequests, but the first step, authenticating the user before she can load the file dist/index.html, fails.
Surprisingly, the user is not prompted to log in; instead she receives the dist/index.html file, and all the other files she asks for from the static folder, as if no restricting rule were present.
This is my app.yaml file:
application: helloworld
version: 1
runtime: go
api_version: go1
handlers:
- url: /
  static_files: dist/index.html
  upload: dist/index.html
  secure: always
  login: required  # this is what fails as far as I'm concerned
- url: /(.*\.(txt|html|json|png|js|log|md|css|ico))
  static_files: dist/\1
  upload: dist/(.*\.(txt|html|json|png|js|log|md|css|ico))
  secure: always
  login: required
- url: /.*
  script: _go_app
  secure: always
  login: required
The folder that I uploaded to appengine looks like this:
app.yaml
index.yaml
xhr_responses.go    # the intended future non-static AJAX part
dist/
    index.html
    (loads of other stuff that is static)
The 'login:' handler options in the .yaml config files rely on Google's authentication, which can be persisted using cookies and survive a browser restart.
To properly test the authentication you need to either use a fresh incognito browser session or go to one of the Google sites and ensure you're not logged in (explicitly log out from all Google accounts if needed) before testing.
Apparently I was already signed in when trying things on the live App Engine; I had simply forgotten that this is how it knows not to redirect to a new login prompt.

App Engine taking Far Longer (> 1 sec) to serve static JS than static CSS

(Question edited because I realized it involves the file type)
This file is 20kb. It is consistently taking > 1 second to serve.
http://www.adrenalinemobility.com/js/ss-symbolicons.js
Here is the same file with .css as its extension:
http://www.adrenalinemobility.com/js/ss-symbolicons.css
It serves almost a whole second faster.
Here is my app.yaml:
application: adrenaline-website
version: 1
api_version: 1
runtime: python27
threadsafe: true
libraries:
- name: jinja2
  version: latest
handlers:
- url: /favicon\.ico
  static_files: assets/favicon.ico
  upload: assets/favicon\.ico
- url: /css
  static_dir: assets/css
- url: /img
  static_dir: assets/img
- url: /js
  static_dir: assets/js
- url: /.*
  script: web.APP
I've also tried this static_files line (before the /js handler), and it was slow too:
- url: /js/ss-symbolicons.js
  static_files: assets/js/ss-symbolicons.js
  upload: assets/js/ss-symbolicons.js
Ways I have observed this:
Chrome, Firefox (both on Linux) - from a DSL connection in Silicon Valley
wget, curl, etc. from that machine.
Remote wget and curl from a high-speed server at the University of Illinois
Remote web testing services like webpagetest (see below):
Here's a webpagetest waterfall graph that illustrates this problem - notice the one file has a huge TTFB: http://www.webpagetest.org/result/131101_ZQ_ZGQ/1/details/
If I manually set the mime_type to text, it goes fast; application/javascript, application/x-javascript, and text/javascript are all slow. Currently those files are served without a manually specified mime_type, if you wish to test.
Some more info, as noticed by jchu:
The slow version serves with Content-Length: 19973, while the fast version serves with Transfer-Encoding: chunked.
Still more details:
I usually get server 74.125.28.121. Someone on reddit got server 173.194.71.121, and serving speeds seem even between the two files there. So maybe it's server/location dependent?
Another post about this issue
Here is a pastebin with full curl logs of requests for both files
Here is another pastebin with just the timing information from ten requests on each file in a tight loop
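Such a loop is easy to reproduce with curl's built-in timing variables (a sketch; point it at either URL above):

# Print time-to-first-byte and total time for ten consecutive requests.
for i in $(seq 1 10); do
  curl -o /dev/null -s \
    -w 'TTFB: %{time_starttransfer}s  total: %{time_total}s\n' \
    http://www.adrenalinemobility.com/js/ss-symbolicons.js
done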
Add mime_type: text to your JavaScript static resource.
Would need to look into what mime_type is being assumed, and what clever trick is being done for text vs. other mime types...
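For example, applied to the /js handler from the question's app.yaml (a sketch; text/plain is my reading of "text", and mislabeling JS can trip strict MIME checking, so treat this as a diagnostic workaround rather than a final fix):

- url: /js
  static_dir: assets/js
  mime_type: text/plain  # workaround: serve JS as text to avoid the slow path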
I've been seeing the same behavior.
GAE has some strange edge-caching behavior for these JavaScript files. After the fourth or fifth consecutive request, the response time improves significantly (some type of edge caching of the gzipped content kicks in). In my experiments it's gone from around 1.2 s to 190 ms.
I've been able to reproduce this consistently across 3 GAE apps.
If you use Chrome, you can go to the DevTools settings and disable the cache (while DevTools is open). Reloading the page a few times will get your JavaScript transfer times down to reasonable numbers.
Google's edge cache probably has some logic where it determines it won't bother caching gzipped js files until they're requested frequently enough. This would appear to encourage you to write a script that constantly downloads your js files to ensure that they download a few hundred milliseconds faster. I suspect that the edge cache is smart enough to only cache for one region at a time though.
