Cache busting with versioning does not seem to work - reactjs

I am currently using versioning to bust the cache. I used to generate a different file name with a date or version for each release. However, that breaks Google's cached pages, because Google looks for the old file name.
I have a webpack setup for the chunking.
output.filename = '[name].js?v=' + hash
output.chunkFilename = '[name].js?v=' + hash
I can see that the browser requests the files with v=xxx correctly.
However, I sometimes have to ask customers to open the dev tools and do a clear cache and hard refresh, because a normal refresh does not pick up the new code.
I also use the Cloudflare CDN, which has its own cache policy.
Cloudflare response headers:
cache-control: max-age=31536000
cf-bgj: minify
cf-cache-status: HIT
cf-polished: origSize=9873
How do I make sure the browser and Cloudflare purge all the JS and CSS files when new code is pushed?
I do not know what to do when a normal refresh does not work.

On Cloudflare there are several ways to control the behavior of the cache:
Understanding the Cloudflare CDN (general rules)
Cache level (can be configured to consider or ignore the querystring)
Page Rules (useful to fine tune caching behavior based on URL patterns)
Origin Cache Control (to control the behavior based on the cache headers returned by your origin server)
You also have various options (depending on the plan) for proactively purging certain resources with Cache Purge (either from the dashboard or via the API).
It is worth reviewing the above settings (in particular cache levels and page rules) to verify that the querystring is being considered part of the cache key used to retrieve the data. In particular, the header cf-cache-status: HIT indicates that the requested resource was fetched from the CDN cached copy.
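As a rough illustration of the API route, here is a minimal sketch that only builds a purge request. It assumes the v4 purge_cache endpoint with a bearer API token, so check the current Cloudflare API reference before relying on it. The function name buildPurgeRequest, the ZONE_ID and API_TOKEN placeholders, and the example.com URLs are made up for the example.

```javascript
// Builds (but does not send) a cache purge request for the Cloudflare v4 API.
// zoneId and apiToken are placeholders you must supply yourself.
function buildPurgeRequest(zoneId, apiToken, fileUrls) {
  // With no URL list, ask for the whole zone cache to be purged instead.
  const body = fileUrls && fileUrls.length > 0
    ? { files: fileUrls }
    : { purge_everything: true };

  return {
    url: `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiToken}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}

// Example: purge two bundles after a deploy (hypothetical URLs).
const purge = buildPurgeRequest('ZONE_ID', 'API_TOKEN', [
  'https://example.com/main.js?v=abc123',
  'https://example.com/vendor.js?v=abc123',
]);
// To send it: fetch(purge.url, purge.options) in a runtime that provides fetch.
```

If the querystring is part of the cache key, the URLs listed under files presumably need to include the v=... part exactly as the browser requests it.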

Related

Force updates on installed PWA when changing index.html (prevent caching)

I am building a react app, which consists in a Single Page Application, hosted on Amazon S3.
Sometimes, I deploy a change to the back-end and to the front-end at the same time, and I need all the browser sessions to start running the new version, or at least those whose sessions start after the last front-end deploy.
What happens is that many of my users are still running the old front-end version on their phones for weeks, and it is no longer compatible with the new version of the back-end, while some of them get the update by the time they start the next session.
As I use Webpack to build the app, it generates bundles with hashes in their names, while the index.html file, which defines the bundles that should be used, is uploaded with the following cache-control property: "no-cache, no-store, must-revalidate". The service worker file has the same cache policy.
The idea is that the user's browser can cache everything except the first files it needs. The plan seemed good, but when I replace the index.html file with a newer version, my users do not refetch it when they restart the app.
Is there a definitive guide, or a way to work around that problem?
I also know that a PWA should work offline, so it has to be able to cache files for reuse, but that does not help me roll out a massive and instantaneous update either, right?
What are the best options I have to do it?
You've got the basic idea correct. Why your index.html is not updated is hard to say since you have not provided any code; please include your Service Worker code. Keep in mind that, depending on the logic implemented in the Service Worker, it does not necessarily honor the HTTP caching headers and may cache everything, including the index.html file, which seems to be what is happening now.
In order to have the app also work in offline mode, you would probably want to use a network-first SW strategy. With network-first, the browser tries to load files from the network, and if that fails it falls back to the latest cached version of the file it tried to get. Another option is the stale-while-revalidate strategy, which first gives the user the old file (which is super fast) and then updates the file in the background. There are other strategies as well; I suggest you read through the documentation of Workbox, the most widely used SW library (https://developers.google.com/web/tools/workbox/modules/workbox-strategies).
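To make the network-first idea concrete, here is a minimal sketch of the decision logic only. It is not Workbox code and not your Service Worker: fetchFn and cache are injected stand-ins (assumptions for the example) so the logic can run outside a browser. In a real SW they would be fetch() and the Cache Storage API.

```javascript
// Network-first: try the network, fall back to the last cached copy.
// fetchFn and cache are injected so the sketch runs outside a browser.
async function networkFirst(request, fetchFn, cache) {
  try {
    const response = await fetchFn(request);
    cache.set(request, response);   // remember the fresh copy for offline use
    return response;
  } catch (err) {
    if (cache.has(request)) {
      return cache.get(request);    // offline: serve the latest cached version
    }
    throw err;                      // nothing cached, so surface the failure
  }
}

// Example wiring with a Map as the cache and fake fetch functions.
const demoCache = new Map();
const onlineFetch = async () => 'fresh index.html';
const offlineFetch = async () => { throw new Error('offline'); };

networkFirst('/index.html', onlineFetch, demoCache)
  .then(() => networkFirst('/index.html', offlineFetch, demoCache))
  .then((body) => console.log('served while offline:', body));
```

The point of the design is that the cache is only a fallback, so a reachable server always wins. The cost is that every load waits for the network first.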
One thing to keep in mind:
With every strategy except "skip the SW and go to the network", you cannot really ensure that the user gets the latest version of index.html. If the SW returns something from the cache, it could be an old version, and that's that. What is usually done in these situations is to notify the user that a new version of the app has been downloaded in the background. Basically, the user loads the app and sees the version that was available in the cache, and the SW then checks for updates. If an update is found (there is a new index.html and, because of that, a new service-worker.js), the user sees a notification saying that the page should be refreshed. You can also manually trigger the SW to check the server for an update from your own JS code if you want. In that situation, too, you would show a notification to the user.
Does this help you?

Change to static file doesn't happen immediately after deploy

When I change a static file (here page.html), and then run appcfg.py update, even after deployment is successful and it says the new files are serving, if I curl for the file the change has not actually taken place.
Relevant excerpt from my app.yaml:
default_expiration: "10d"
- url: /
static_files: static/page.html
upload: static/page.html
secure: always
Google's docs say "Static cache expiration - Unless told otherwise, web proxies and browsers retain files they load from a website for a limited period of time." There shouldn't be any browser cache as I am using curl to get the file, and I don't have a proxy set up at home at least.
Possible hints at the answer
Interestingly, if I curl for /static/page.html directly, it has updated, but if I curl for / which should point to the same file, it has not.
Also if I add some dummy GET arg, such as /?foo, then I can also see the updated version. I also tried adding the -H "Cache-Control: no-cache" option to my curl command, but I still got the stale version.
How do I see updates to / immediately after deploy?
As pointed out by Omair, the docs for the Python standard environment state that "files are likely to be cached by the user's browser, as well as by intermediate caching proxy servers such as Internet Service Providers". But I've found a way to flush static files cached by your app on Google Cloud.
Head to your Google Cloud Console and open your project. Under the left hamburger menu, head to Storage -> Browser. There you should find at least one Bucket: your-project-name.appspot.com. Under the Lifecycle column, click on the link with respect to your-project-name.appspot.com. Delete any existing rules, since they may conflict with the one you will create now.
Create a new rule by clicking on the 'Add rule' button. For the object conditions, choose only the 'Newer version' option and set it to 1. Don't forget to click on the 'Continue' button. For the action, select 'Delete' and click on the 'Continue' button. Save your new rule.
This new rule will take up to 24 hours to take effect, but at least for my project it took only a few minutes. Once it is up and running, the version of the files being served by your app under your-project-name.appspot.com will always be the latest deployed, solving the problem. Also, if you are routinely editing your static files, you should remove any expiration element from handlers related to those static files and the default_expiration element from the app.yaml file, which will help avoid unintended caching by other servers.
According to App Engine's documentation on static cache expiration, this could be due to caching servers between you and your application respecting the caching headers on the responses:
The expiration time will be sent in the Cache-Control and Expires HTTP response headers, and therefore, the files are likely to be cached by the user's browser, as well as by intermediate caching proxy servers such as Internet Service Providers.
Once a file is transmitted with a given cache expiration time, there is generally no way to clear it out of intermediate caches, even if you clear the browser cache or use curl with a no-cache option. Re-deploying a new version of the app will not reset the caches either.
For files that need to be modified, shorter expiration times are recommended.
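To get a feel for what an expiration such as "10d" means for a cache, here is a small sketch that converts an app.yaml style expiration string into seconds. It assumes the documented unit letters d, h, m and s separated by spaces. The helper names and the exact header format returned by cacheControlFor are assumptions for illustration, not App Engine's actual implementation.

```javascript
// Converts an App Engine style expiration string such as "10d" or "4d 5h"
// into seconds. Units: d (days), h (hours), m (minutes), s (seconds).
function expirationToSeconds(expiration) {
  const unitSeconds = { d: 86400, h: 3600, m: 60, s: 1 };
  let total = 0;
  for (const part of expiration.trim().split(/\s+/)) {
    const match = /^(\d+)([dhms])$/.exec(part);
    if (!match) {
      throw new Error(`Unrecognized expiration part: ${part}`);
    }
    total += Number(match[1]) * unitSeconds[match[2]];
  }
  return total;
}

// Hypothetical helper: a Cache-Control value for that lifetime.
function cacheControlFor(expiration) {
  return `public, max-age=${expirationToSeconds(expiration)}`;
}

const tenDays = expirationToSeconds('10d');   // 864000
const header = cacheControlFor('10m');        // 'public, max-age=600'
```

With default_expiration: "10d", any cache that stored the file may keep serving it for up to 864000 seconds, which is why a shorter value is the safer choice for files you edit often.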

HTML files not cached when requested on localhost

Recently we made changes to one of our applications and noticed that our customers were not getting the new views. So we decided to version the files so that we can force the clients' browsers to fetch the new views every time we release a new version.
So far so good, but we needed to test this change (the versioning) in a local environment before deploying it. Unfortunately, on localhost the views are never cached. I noticed that the requests for the views are sent with Cache-Control:max-age=0. If I am not mistaken, this causes the resource not to be cached.
I read also that this could be caused by the ETag header, so I removed it but the views are still not cached. Also, I set the Cache-Control:max-age=86400,public header in the response. So the only reason left was the Cache-Control:max-age=0 header in the request. So I tried to change the header. I set the cache-control header in the request to be Cache-Control:max-age=86400,public, but still no luck.
The views are requested by AngularJS; they are templates in directives. There is also a difference between the IIS version that we use locally and the one on the server. Locally we use 7.5, and on the server it is 8.0. Could this be the problem?
Can anyone guide me to the right direction?
Edit:
The Disable Cache option in the Chrome dev tools is turned off.
One thing I can think of is that you have Disable Cache enabled in your browser's dev tools, if this only happens on your local system.
Getting around browser caching is normally quite tricky, and most people have trouble disabling it using headers. Unfortunately, Cache-Control: max-age is not uniformly implemented across browsers. If the issue still occurs in spite of the above, could you provide screenshots of the network tab in your Chrome developer tools?

Does changing Cloudfront Download Distribution Origin Path result in a cache invalidation?

I am working on a solution to get S3 and Cloudfront in sync when I upload a new version of an angular app.
My approach is to upload the new version to a new folder with an increasing version number http://awsbucket/v1 ... /v2 and after that updating the Download Distribution Origin Path to that new folder.
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/distribution-web-values-specify.html#DownloadDistValuesOriginPath
I am wondering whether this change of the Origin Path automatically results in a complete cache invalidation, or whether I have to send invalidation requests nevertheless.
If you keep moving your web resources (images, scripts, or anything else that can be sent over HTTP) to new versions and make the necessary changes in your app, you will by design start using the newer versions of the resources. The older versions' cache entries will grow colder and colder and will eventually be evicted from the cache.
Invalidation requests are costly and time consuming, while versioning is easy and natural. Its most common use cases are versioned CSS stylesheets and updated JS scripts. The same approach can be extrapolated to your use case.
Also, you don't need to change the origin; keep adding the new files to S3 and make sure the app references them. That will do.
To answer your question, NO - changing the Origin, including just the path, does not result in cache invalidation.
Information can be found here
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/distribution-web-values-specify.html#DownloadDistValuesDomainName
Quoting the specific part:
Changing the origin does not require CloudFront to repopulate edge caches with objects from the new origin. As long as the viewer requests in your application have not changed, CloudFront will continue to serve objects that are already in an edge cache until the TTL on each object expires or until seldom-requested objects are evicted.
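The quoted behavior can be pictured with a toy model. The sketch below is not a CloudFront API: EdgeCache is a hypothetical class that caches by viewer request path and keeps entries until their TTL runs out, which is enough to show why swapping the origin alone changes nothing for paths that are already cached.

```javascript
// Toy model of an edge cache keyed by the viewer request path.
class EdgeCache {
  constructor(origin, ttlSeconds) {
    this.origin = origin;        // Map of path -> body, standing in for the origin
    this.ttlSeconds = ttlSeconds;
    this.entries = new Map();    // path -> { body, expiresAt }
  }

  // Swapping the origin deliberately leaves cached entries untouched.
  setOrigin(origin) {
    this.origin = origin;
  }

  get(path, nowSeconds) {
    const hit = this.entries.get(path);
    if (hit && hit.expiresAt > nowSeconds) {
      return hit.body;           // served from the edge, origin not consulted
    }
    const body = this.origin.get(path);
    this.entries.set(path, { body, expiresAt: nowSeconds + this.ttlSeconds });
    return body;
  }
}

const originV1 = new Map([['/app.js', 'v1 code']]);
const originV2 = new Map([['/app.js', 'v2 code'], ['/app.2.js', 'v2 code']]);
const edge = new EdgeCache(originV1, 3600);

const first = edge.get('/app.js', 0);            // fetched from the v1 origin
edge.setOrigin(originV2);                        // "change the Origin Path"
const afterSwitch = edge.get('/app.js', 10);     // still the cached v1 copy
const newName = edge.get('/app.2.js', 10);       // new request path, so v2
const afterTtl = edge.get('/app.js', 4000);      // TTL expired, so v2
```

In this model only a changed request path (or an expired TTL) reaches the new origin, which matches the advice to version the file names rather than the origin.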

Caching & GZip on GAE (Community Wiki)

Why does it seem like Google App Engine isn’t setting appropriate cache-friendly headers (like far-future expiration dates) on my CSS stylesheets and JavaScript files? When does GAE gzip those files? My app.yaml marks the respective directories as static_dirs, so the lack of far-future expiration dates is kind of surprising to me.
This is a community wiki to showcase the best practices regarding static file caching and gzipping on GAE!
How does GAE handle caching?
It seems GAE sets near-future cache expiration times, but does use the ETag header. This lets browsers ask, “Has this file changed since it had an ETag of X68f0o?” and hear “Nope – 304 Not Modified” back in response.
As opposed to far-future expiration dates, this has the following trade-offs:
Your end users will get the latest copies of your resources, even if they have the same name (unlike far-future expiration). This is good.
Your end users will however still have to make a request to check on the status of that file. This does slow down your site, and is “pure overhead” when the content hasn’t changed. This is not ideal.
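The ETag round trip described above can be sketched as follows. This is an illustration of the general HTTP mechanism, not GAE's implementation: the hashing scheme in etagFor and the shape of the object returned by respond are assumptions made for the example.

```javascript
const crypto = require('node:crypto');

// Derives a quoted ETag from the file content (one possible scheme).
function etagFor(content) {
  const digest = crypto.createHash('sha256').update(content).digest('hex');
  return `"${digest.slice(0, 16)}"`;
}

// Decides between 304 and 200 from the If-None-Match request header.
function respond(content, ifNoneMatch) {
  const etag = etagFor(content);
  if (ifNoneMatch === etag) {
    // Content unchanged: the request still happened, but no body is sent.
    return { status: 304, headers: { ETag: etag }, body: '' };
  }
  return { status: 200, headers: { ETag: etag }, body: content };
}

const firstVisit = respond('body { margin: 0 }', undefined);
const revisit = respond('body { margin: 0 }', firstVisit.headers.ETag);
const afterEdit = respond('body { margin: 1px }', firstVisit.headers.ETag);
console.log(firstVisit.status, revisit.status, afterEdit.status);
```

The revisit still costs a request even though it returns no body, which is the overhead mentioned in the second trade-off.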
Opting for far-future cache expiration instead of (just) etag
Using far-future expiration dates takes two steps and a bit of understanding.
You have to manually update your app to request new versions of resources, e.g. by naming files like mysitestyles.2011-02-11T0411.css instead of mysitestyles.css. There are tools to help automate this, but I’m not aware of any that directly relate to GAE.
Configure GAE to set the expiration times you want by using default_expiration and/or expiration in app.yaml. See the GAE docs on static files.
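One way to automate the renaming step is to derive the suffix from the file content instead of a timestamp, so the name changes only when the file does. The sketch below is a minimal, assumed approach using Node's built-in crypto and path modules; fingerprintedName is a hypothetical helper, not part of GAE or any build tool.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Returns a name such as "mysitestyles.<hash>.css" derived from the content,
// so a changed file gets a new URL and an unchanged file keeps its old one.
function fingerprintedName(fileName, content) {
  const hash = crypto.createHash('sha256').update(content).digest('hex').slice(0, 8);
  const ext = path.extname(fileName);
  const base = path.basename(fileName, ext);
  return `${base}.${hash}${ext}`;
}

const oldName = fingerprintedName('mysitestyles.css', 'body { margin: 0 }');
const newName = fingerprintedName('mysitestyles.css', 'body { margin: 1px }');
console.log(oldName, newName);
```

Your templates then need to reference the generated name, which is the part build tools usually handle with a manifest file.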
A third option: Application manifests
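Cache manifests are an HTML5 feature that overrides cache headers (see the MDN article, DiveIntoHTML5, and the W3C specification). This affects more than just the caching of your script and style files, however. Use with care! Note that the Application Cache has since been deprecated and removed from current browsers.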
When does GAE gzip?
According to Google’s FAQ,
Google App Engine does its best to serve gzipped content to browsers that support it. Taking advantage of this scheme is automatic and requires no modifications to applications.
We use a combination of request headers (Accept-Encoding, User-Agent) and response headers (Content-Type) to determine whether or not the end-user can take advantage of gzipped content. This approach avoids some well-known bugs with gzipped content in popular browsers. To force gzipped content to be served, clients may supply 'gzip' as the value of both the Accept-Encoding and User-Agent request headers. Content will never be gzipped if no Accept-Encoding header is present.
This is covered further in the runtime environment documentation (Java | Python).
Some real-world observations do show this to generally be true. Assuming a gzip-capable browser:
GAE gzips actual pages (if they have proper content-type headers like text/html; charset=utf-8)
GAE gzips scripts and styles in static_dirs (defined in app.yaml).
Note that you should not expect GAE to gzip images like GIFs or JPEGs as they are already compressed.
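For comparison, this is roughly what such a negotiation looks like when written by hand. It is a simplified sketch using Node's zlib, not GAE's actual logic: it ignores the User-Agent checks and q-values mentioned in the FAQ, and the list of compressible content types is an assumption for the example.

```javascript
const zlib = require('node:zlib');

// Compresses the body only when the client advertises gzip support and the
// content type is text-like; already-compressed images are left untouched.
function maybeGzip(body, acceptEncoding, contentType) {
  const clientAcceptsGzip = /\bgzip\b/i.test(acceptEncoding || '');
  const compressible = /^(text\/|application\/(javascript|json))/.test(contentType);
  if (!clientAcceptsGzip || !compressible) {
    return { headers: {}, body: Buffer.from(body) };
  }
  return { headers: { 'Content-Encoding': 'gzip' }, body: zlib.gzipSync(body) };
}

const css = 'body { margin: 0 } '.repeat(100);
const zipped = maybeGzip(css, 'gzip, deflate', 'text/css');
const plain = maybeGzip(css, undefined, 'text/css');
console.log(zipped.body.length, plain.body.length);
```

A request without an Accept-Encoding header gets the uncompressed body, mirroring the last sentence of the FAQ quote.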

Resources