Can vaadin-flow applications be run (reliably) on GAE standard environment? If not, what has to be done to make them run (reliably)? - google-app-engine

This is a wrap-up of several open questions in the vaadin forum (which is moving to stack-overflow) - see 2, 3, 4, ...
The basic question is: "How can one run a (recent V14 LTS or V19/20) vaadin-flow application reliably on google app engine (GAE manual or automatic scaling) standard environment (i.e. no docker, no google compute engine), without experiencing constant refreshing of components"
Vaadin has a tutorial for deploying vaadin applications to GAE flexible (not standard, as asked here). The tutorial doesn't mention that one might run into trouble when GAE switches the server instance. Even GAE itself mentions Vaadin as one of the supported frameworks.
Old Vaadin 8 release notes state that support for GAE has been dropped.
According to the questions in the vaadin forum, vaadin-flow on GAE leads (or at least has led, in older Vaadin versions) to constant refreshing of components and/or loss of session state.
If one uses manual scaling, one's application can rely on the state of the memory over time, so vaadin-flow applications should not be affected by switches of the server instance (which will still occur eventually, when an instance is shut down due to an error or for maintenance reasons).
So the first question is: "When running a vaadin-flow application on GAE standard with manual scaling, will it lead to constant refreshing of components and/or loss of session state even when the instance is not switched?"
If that works, then vaadin-flow is fine on GAE standard when one needs neither dynamic scaling nor high availability. If that does not work, vaadin-flow on GAE standard would be a no-go for any type of application (and one would need to switch to another provider, or to GAE flexible and Docker).
The next question is: "What has to be done to make vaadin-flow application on GAE standard run reliably, even when instances are scaled or switched due to maintenance?"
The following suggestions were stated in the forum - but no one ever confirmed that they work:
One could have set <sessions-enabled>true</sessions-enabled> (in appengine-web.xml) for Java 8. This setting no longer exists for Java 11. Even when the number of instances changes or instances are restarted, this could have been the solution, since session data is stored in memcache, which is available across all instances.
When instances are moved or shut down, Google sends a shutdown notification, so one could implement a shutdown hook and try to serialize all session state (if vaadin provides a way to serialize it manually and automatically de-serialize it when another instance takes over).
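Sketching the second suggestion: Vaadin has no official API for manually serializing a session and re-attaching it on another instance, so the SessionStore facade and the session registry below are hypothetical, and the restore half is left out entirely. This only shows roughly what the shutdown side could look like, assuming everything reachable from the session is Serializable.

```java
// Rough sketch only: SessionStore is a hypothetical persistence facade (e.g. backed by
// Cloud Storage or Datastore), and LIVE_SESSIONS would have to be maintained from a
// Vaadin SessionInitListener/SessionDestroyListener. Restoring the state on the next
// instance is not shown and is the genuinely hard part.
import com.vaadin.flow.server.VaadinSession;
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ShutdownSerializer {

    public interface SessionStore {
        void save(String sessionId, byte[] serializedSession);
    }

    // Hypothetical registry of currently active sessions, keyed by session id.
    public static final Map<String, VaadinSession> LIVE_SESSIONS = new ConcurrentHashMap<>();

    public static void install(SessionStore store) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            LIVE_SESSIONS.forEach((id, session) -> {
                session.lock();                        // hold the session lock while serializing
                try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                     ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                    out.writeObject(session);          // fails if anything reachable is not Serializable
                    out.flush();
                    store.save(id, bytes.toByteArray());
                } catch (Exception ex) {
                    // a single non-serializable object anywhere in the graph ends up here
                } finally {
                    session.unlock();
                }
            });
        }));
    }
}
```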
Has anyone found a reliable solution for this?

I think it's not possible.
According to https://stackoverflow.com/a/10640357/377320, GAE uses the Datastore to store session information and only synchronizes objects set via session.setAttribute(). However, according to https://mvysny.github.io/vaadin-14-session-replication/, Vaadin doesn't call setAttribute(), and that is deliberate.
That means GAE won't synchronize the state properly, while requests will land on random nodes (since session affinity/sticky sessions are not supported on GAE), i.e. on nodes with obsolete Vaadin state. The most likely outcome is that Vaadin components will constantly lose state, and Vaadin will frequently attempt to resync the UI state by performing browser reloads.
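To make the setAttribute() point concrete, a commonly suggested workaround for containers that only persist attributes written during a request is a filter that re-sets every attribute at the end of each request. This is illustrative only; as argued above, it still doesn't give Vaadin Flow consistent state across instances without sticky sessions, because concurrent requests served by different nodes can overwrite each other's state.

```java
// Illustrative sketch: force the container's session persistence to pick up mutated
// attributes by re-setting them after every request. Not claimed to be sufficient for
// Vaadin Flow on GAE; it only demonstrates the "only setAttribute() is synchronized" rule.
import java.io.IOException;
import java.util.Collections;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.annotation.WebFilter;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;

@WebFilter("/*")
public class SessionWriteBackFilter implements Filter {

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        chain.doFilter(req, res);
        HttpSession session = ((HttpServletRequest) req).getSession(false);
        if (session != null) {
            // setAttribute marks the attribute as dirty so the container writes it back.
            for (String name : Collections.list(session.getAttributeNames())) {
                session.setAttribute(name, session.getAttribute(name));
            }
        }
    }

    @Override public void init(FilterConfig config) {}
    @Override public void destroy() {}
}
```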
Vaadin Flow works best with sticky sessions and long-running servers, while GAE sounds like the opposite of that.
I believe your best bet is to use Vaadin Fusion with a stateless server.

Regarding serializing all session state (e.g. on a shutdown hook): this requires that all of your stateful objects - all of your data beans, Vaadin Flow views, compositions etc. - are serializable. In other words, all classes must implement java.io.Serializable. This can be done, as all classes from Vaadin should be serializable, but it's up to the application developer to make sure that every custom instantiable class in the codebase (and any instantiable class in the dependencies) can be serialized and deserialized. Based on practical experience, this is not a trivial requirement - it usually drives the application's architecture by limiting or changing the design patterns used in the code. Making an existing codebase fully serializable will likely incur significant refactoring work. This is one of the big reasons why sticky sessions (which are not available in GAE) are the recommended approach for deploying Vaadin Flow applications in multi-node environments.
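To illustrate what that means in code, a minimal sketch (CustomerService is a made-up backend interface; package-private classes are used only to keep everything in one file):

```java
// Sketch of what "everything must be Serializable" means in practice.
import java.io.Serializable;
import java.util.List;
import com.vaadin.flow.component.orderedlayout.VerticalLayout;
import com.vaadin.flow.router.Route;

// Data beans referenced from views must be Serializable, transitively.
class Customer implements Serializable {
    private static final long serialVersionUID = 1L;
    private String name;
    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

// Hypothetical backend service, e.g. holding a connection pool: not serializable.
interface CustomerService {
    List<Customer> findAll();
}

// Vaadin components are themselves Serializable; the risk is in the fields you add.
@Route("customers")
class CustomerView extends VerticalLayout {

    private Customer selected = new Customer();  // fine: Serializable all the way down
    private transient CustomerService service;   // not serializable: mark transient, re-acquire after deserialization

    CustomerView() {
        // build the UI here; obtain 'service' lazily (or via DI) so that it can also be
        // re-created after the session has been deserialized on another instance
    }
}
```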

My application has been running for months now and I must say that it has worked just fine. BUT we never needed more than one instance (basic_scaling with max_instances: 1), so no sticky sessions were needed, and my conclusions might not be valid if one needs more than one instance:
both F- and B-type instance classes worked fine - not a single instance switch has occurred so far, and no sessions were lost
if the instance needs to be re-started due to idle-timeout, this is just a matter of ~10 seconds which is ok for the test-instance. For production I would recommend manual scaling with 24 instance-hours per day (so the instance will never be re-started)
Session affinity can be set in app.yaml according to https://cloud.google.com/appengine/docs/flexible/java/using-websockets-and-session-affinity, so it might work with multiple instances as well
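For reference, the two setups described above correspond roughly to app.yaml fragments like the following (Java 11 standard environment; the instance class and idle timeout are just example values):

```yaml
# Test: basic scaling capped at a single instance; it may be restarted after the idle timeout.
runtime: java11
instance_class: B2
basic_scaling:
  max_instances: 1
  idle_timeout: 10m
```

```yaml
# Production: manual scaling, one resident instance running 24/7 (24 instance-hours per day).
runtime: java11
instance_class: B2
manual_scaling:
  instances: 1
```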

Related

Costs generated by old versions of my appengine application

I have many App Engine versions that are still active but not used because they are old. What costs can they generate? How do you handle the old versions of your App Engine app? Do you delete them or deactivate them?
In the documentation I can't find any reference to the costs of old versions.
https://cloud.google.com/appengine/pricing?hl=it
UPDATE:
(GAE STANDARD)
Thank you
It's a poorly documented aspect of App Engine. What you describe as versions that are "not used" are more specifically versions that don't receive traffic. But depending on your scaling configuration (essentially defined in your app.yaml file), there may not be a 1:1 relationship between traffic and the number of active instances serving a version.
For example, I'm familiar with using "automatic_scaling" with min_instances = 1. This prevents a service from scaling to zero instances and from incurring latency on the first incoming request after some idle time, but it also means that any version, until it is deleted, generates a baseline cost of one instance running 24/7.
Also, I've found that the estimated number of instances displayed in the dashboard you screenshotted can be misleading (more specifically, it can show 0 instances while there is actually one running).
Note that if you do not have any scaling-related configuration in your app.yaml file, you should check which default values App Engine currently applies.
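As an illustration (standard environment; the exact defaults depend on the runtime generation, so do check the docs), a version that should not generate a baseline cost can be pinned to scale to zero explicitly. A hypothetical app.yaml fragment, with example values:

```yaml
# Hypothetical fragment for an old version that should cost nothing while idle.
automatic_scaling:
  min_instances: 0       # allow scaling to zero when the version gets no traffic
  min_idle_instances: 0  # keep no warm idle instance
  max_instances: 2       # cap the worst-case cost
```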
It's tricky when you get started and I'm sure I'm not the only one who lost most of the free trial budget because of this.
There is actually a limit on the number of versions you can have, depending on your app's pricing:
Limit          Free app   Paid app
Max versions   15         210
It seems that you can keep them active in case you want to switch between versions by migrating or splitting traffic between them, but you won't be charged for them as long as you don't have more than 15 versions.

GAE, am I still required to implement a load balancer?

We have a production web application deployed on GAE (LAMP stack) with autoscaling configured. According to the documentation, Google will automatically spin up additional instances to meet demand; this seemed to be proven when we went live hours before a season finale aired (which guaranteed traffic would hit our site), and our site did NOT fall over even with the expected sizable influx - so kudos to Google! However, I'd be naive to think that this server architecture is done, knowing that we're still in our infancy and could get 10-100x more traffic on a consistent basis in the near future as we gain popularity and move into the global market. So my question is:
Should I be implementing a Load Balancer in GCP or will GAE be able to scale "indefinitely" to accommodate?
Based on this answer (AppEngine load balancing across multiple regions), you'll need to implement a load balancer if you're targeting multiple regions.
Otherwise it depends on your configuration and the thresholds you've set in your GAE config.
According to https://cloud.google.com/appengine/docs/standard/go/how-instances-are-managed, there are three ways you can define scaling on your AppEngine instance:
Automatic scaling
Automatic scaling creates dynamic instances based on request rate, response latencies, and other application metrics. However, if you specify a number of minimum idle instances, that specified number of instances run as resident instances while any additional instances are dynamic.
Basic scaling
Basic scaling creates dynamic instances when your application receives requests. Each instance will be shut down when the app becomes idle. Basic scaling is ideal for work that is intermittent or driven by user activity.
Manual scaling
Manual scaling uses resident instances that continuously run the specified number of instances regardless of the load level. This allows tasks such as complex initializations and applications that rely on the state of the memory over time.
So the answer is: it depends. You'll just need to base your scaling strategy on how your load distribution looks. I would expect that automatic scaling is fine for 90% of early-stage websites, though that's just my impression.
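For a single-region app, "base your scaling strategy on your load distribution" mostly comes down to tuning the automatic_scaling block in app.yaml. A hypothetical starting point (the values are examples to adjust, not recommendations):

```yaml
# Example automatic_scaling block for the standard environment.
automatic_scaling:
  min_instances: 1               # keep one warm instance to avoid cold starts
  max_instances: 20              # hard cap on cost
  target_cpu_utilization: 0.65   # scale out when average CPU exceeds 65%
  max_concurrent_requests: 30    # requests per instance before scaling out
```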

Caching after GAE standard migration to Go 1.11/1.12

I've almost completed migrating based on google's instructions.
It's very nice to not have to call into the app-engine libraries whatsoever.
However, now I must replace my calls to app-engine-standard memcached.
Here's what the guide says: "To use a memcache service on App Engine, use Redis Labs Memcached Cloud instead of App Engine Memcache."
So is this my only option, a third party? They don't even list pricing on their page if GCE is selected.
I also see in the standard environment how-to guides there is a guide on Connecting to internal resources in a VPC network.
That link mentions Cloud Memorystore. I can't find any examples showing whether this is advisable or even possible to do on GAE standard. Of course it wasn't previously possible, but now that GAE standard has become much more "standard", I think it should be possible?
Thanks for any advice on the best way forward.
Memorystore appears to be Google's replacement:
https://cloud.google.com/memorystore/
You connect to it using this guide:
https://cloud.google.com/appengine/docs/standard/go/using-memorystore
Alas it costs about $1.20/GB per day with no free quota.
Thus, if your data doesn't change and requires less than 100MB of cache at a time, the first answer might be better (free). Also, your data won't blow up the instance's memory, since you can control the maximum size of the cache.
However, if your data changes or you need more cache, Memorystore is the more direct replacement for Memcache - it just costs money.
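The guide linked above is written for Go; from a Java service the idea is the same: reach the Memorystore instance through the Serverless VPC Access connector and talk to it with an ordinary Redis client such as Jedis. A minimal sketch (the REDIS_HOST/REDIS_PORT environment variables and the fallback IP are assumptions, not something GAE sets for you):

```java
import redis.clients.jedis.Jedis;

public class MemorystoreClient {
    public static void main(String[] args) {
        // Hypothetical env vars; Memorystore is only reachable via the VPC connector.
        String host = System.getenv().getOrDefault("REDIS_HOST", "10.0.0.3");
        int port = Integer.parseInt(System.getenv().getOrDefault("REDIS_PORT", "6379"));
        try (Jedis jedis = new Jedis(host, port)) {
            jedis.set("greeting", "hello from GAE standard");
            System.out.println(jedis.get("greeting"));
        }
    }
}
```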
I've been thinking about this. 2nd-gen instances have twice the RAM, so if a global cache isn't required (as in, items don't change once created - name items using their SHA-256), you can run your own local thread-safe memcache (such as https://github.com/dgraph-io/ristretto) and allocate some of the extra RAM to it. It'll be faster than Memcache was, so requests can be serviced even faster, keeping the number of instances low.
You could make it global for data that does change, by using pub/sub between instances, but I think that's significantly more work.
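The ristretto library mentioned above is Go; the same idea in Java - a small, thread-safe, size-bounded in-process cache fed from the instance's spare RAM - can be sketched with a synchronized, access-ordered LinkedHashMap (hypothetical class, bounded by entry count rather than bytes for brevity):

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal thread-safe LRU cache bounded by entry count; in practice you would bound it
// by approximate byte size so it cannot blow past the instance's spare RAM.
public class LocalCache<K, V> {

    private final Map<K, V> map;

    public LocalCache(int maxEntries) {
        this.map = Collections.synchronizedMap(
            new LinkedHashMap<K, V>(16, 0.75f, true) {   // access order = LRU eviction
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > maxEntries;
                }
            });
    }

    public V get(K key) { return map.get(key); }
    public void put(K key, V value) { map.put(key, value); }
}
```

Because items are keyed by their SHA-256 and never change, stale entries are impossible, which is what makes a purely local, per-instance cache safe without any cross-instance invalidation.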
To ease the migration to 1.12, I have been thinking of using this solution:
create a dedicated app using the 1.11 runtime.
set up Twirp endpoints to act as a proxy for all the deprecated App Engine services (memcache, mail, search...)

Fastest Open Source Content Management System for Cloud/Cluster deployment

Currently clouds are mushrooming like crazy and people are starting to deploy everything to the cloud, including CMS systems, but so far I have not seen anyone succeed in deploying a popular CMS system to a load-balanced cluster in the cloud. Some performance hurdles seem to prevent standard open-source CMS systems from being deployed to the cloud like this.
CLOUD: A cloud, or better a load-balanced cluster, has at least one frontend server, one network-connected(!) database server and one cloud-storage server. This fits well with Amazon Beanstalk and Google Appengine. (This specifically excludes a CMS on a single computer or a Linux server with MySQL on the same "CPU".)
To deploy a standard CMS in such a load balanced cluster needs a cloud-ready CMS with the following characteristics:
The CMS must deal with the latency of queries to still be responsive and render pages in less than a second to be cached (or use a precaching strategy)
The filesystem probably must be connected to a remote storage (Amazon S3, Google cloudstorage, etc.)
Currently I know that python/django and Wordpress have middleware modules or plugins that can connect to cloud storage instead of a filesystem, but there might be other cloud-ready CMS implementations (Java, PHP, ?) and systems.
I myself have failed to deploy django-CMS to the cloud, ultimately due to the query latency of the remote DB. So here is my question:
Did you deploy an open-source CMS that still performs well in rendering pages and backend admin? Please post your average page rendering access stats in microseconds for uncached pages.
IMPORTANT: Please describe your configuration, the problems you have encountered, which modules had to be optimized in the CMS to make it work, don't post simple "this works", contribute your experience and knowledge.
Such a CMS probably has to make fewer than 10 queries per page (if more, the queries must be made in parallel), and it must cope with filesystem access times of 100ms for a stat and query delays of 40ms.
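To make the latency arithmetic concrete: 10 queries at 40ms each cost roughly 400ms when issued serially, but only about one round trip when issued in parallel. The CMSes discussed here are PHP/Python, but the idea is language-independent; a rough sketch (fetchFragment is a made-up stand-in for one remote query):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class ParallelPageRender {

    // Hypothetical: one remote query (~40ms round trip) producing one page fragment.
    static String fetchFragment(String query) {
        return "<div>" + query + "</div>";
    }

    public static String renderPage(List<String> queries) {
        ExecutorService pool = Executors.newFixedThreadPool(queries.size());
        try {
            List<CompletableFuture<String>> futures = queries.stream()
                    .map(q -> CompletableFuture.supplyAsync(() -> fetchFragment(q), pool))
                    .collect(Collectors.toList());
            // Wall-clock time is roughly one round trip instead of queries.size() round trips.
            return futures.stream().map(CompletableFuture::join).collect(Collectors.joining());
        } finally {
            pool.shutdown();
        }
    }
}
```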
Related:
Slow MySQL Remote Connection
Have you tried Umbraco?
It relies on a database, but it keeps layers of cache so you aren't doing selects on every request.
http://umbraco.com/azure
It works great on Azure too!
I have found an excellent performance test of Wordpress on Appengine. It appears that Google has spent some time optimizing this system for load-balanced-cluster and remote-DB deployment:
http://www.syseleven.de/blog/4118/google-app-engine-php/
Scaling test from the report:

parallel hits   GAE    1&1    Sys11
1               1,5    2,6    8,5
10              9,8    8,5    69,4
100             14,9   -      146,1
Conclusion from the report: the system is slower than on traditional hosting, but it scales much better.
http://developers.google.com/appengine/articles/wordpress
We have managed to deploy python django-CMS (www.django-cms.org) on Google App Engine with Cloud SQL as the DB and Cloud Storage as the filesystem. Cloud Storage was attached by forking and fixing a django.storage module by Christos Kopanos (http://github.com/locandy/django-google-cloud-storage).
After that, a second set of problems came up: we discovered access times of up to 17s for a single page access. We investigated this and found that easy-thumbnails 1.4 accessed the normal file system for mod_time requests while writing results to the store (rendering all thumbnail images on every request). We switched to the development version, where that was already fixed.
Then we worked with SmileyChris to fix unnecessary reads of mod_times (stat'ing the file) on every request for every image, by tracing the problem and posting issues to http://github.com/SmileyChris/easy-thumbnails
This reduced access times from 12-17s to 4-6s per public page on the CMS, basically eliminating all storage/"file"-system access. Once that was fixed, easy-thumbnails (by design) replaced file-system accesses with queries to the DB, checking on every request whether a thumbnail's source image had changed.
One thing for the web designer: if she uses an image.width statement in the template, this forces an ugly, slow read from the "filesystem", because image widths are not cached.
Further investigation led to the conclusion that DB accesses are very costly too, taking about 40ms per round trip.
So far the deployment has been unsuccessful, mostly due to DB access times in the cloud, which lead to 4-5s delays when rendering a page before it is cached.

Project suited for Google App Engine?

I'm currently developing a small hobby project (open sourced at https://github.com/grav/mailbum) which quite simply takes images from a Gmail account and puts them in albums on Picasa Web.
Since it's (currently) only dealing with Google-hosted data, I was thinking about hosting it on Google App Engine, but I'm not sure if it's well-suited for GAE:
Will the maximum execution time be a problem? It's currently 10 minutes according to http://googleappengine.blogspot.com/2010/12/happy-holidays-from-app-engine-team-140.html, but I'd think the tasks (i.e. processing a single mail) would be easy to run in parallel. I'm also guessing that dealing with Google-hosted data would be quite efficient on GAE?
Will the fact that it's written in Clojure be an obstacle? I've researched getting Clojure to run on GAE a bit, but I've never tried it. Any pointers?
Thanks for any advice and thoughts on the project!
It seems like your application is doable on GAE. My points of concern would be:
Does your code ever store the images that it is processing to temporary files? If so, it will need to be changed to do everything in memory, because GAE applications are sandboxed and not allowed to write to the filesystem (if you need temporary persistent storage, you might be able to work something out where you write your file data to a BLOB field in the GAE datastore - see the sketch after this list).
How do you get the images into Picasa Web? If they provide a simple REST/HTTP API then all is well. If you need something more involved than that (like a raw TCP socket) then it won't work.
The 10-minute execution time limit only applies to background tasks. When actually servicing web requests the time limit is 30 seconds. So if you provide a web-based interface to your app, you need to structure things so that the interface is just scheduling jobs that run in the background (i.e. you can't fire off a job directly as part of servicing a web request).
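Hedged sketches for the first and third points above, using the old appengine-api SDK that this era of GAE shipped (from Clojure these would be called via Java interop); the kind name, key scheme and worker URL are made up:

```java
import com.google.appengine.api.datastore.Blob;
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.EntityNotFoundException;
import com.google.appengine.api.datastore.KeyFactory;
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

public class GaeSketches {

    // Point 1: instead of temp files, stash working data as a Blob property on a
    // datastore entity ("TempImage" is a made-up kind; entities are limited to 1 MB).
    public static void saveTempData(String key, byte[] bytes) {
        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
        Entity entity = new Entity("TempImage", key);
        entity.setProperty("data", new Blob(bytes));
        datastore.put(entity);
    }

    public static byte[] loadTempData(String key) throws EntityNotFoundException {
        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
        Entity entity = datastore.get(KeyFactory.createKey("TempImage", key));
        return ((Blob) entity.getProperty("data")).getBytes();
    }

    // Point 3: a web request only enqueues the work; a task-queue worker servlet
    // mapped to /tasks/process-mail (made-up URL) does the long-running processing.
    public static void scheduleMailProcessing(String messageId) {
        Queue queue = QueueFactory.getDefaultQueue();
        queue.add(TaskOptions.Builder.withUrl("/tasks/process-mail")
                .param("messageId", messageId));
    }
}
```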
If none of those sound like show-stoppers to you, then I think your app should work just fine on GAE.
Can't really say whether Clojure will work, though. I have, however, spent time in the past getting some third-party libraries to work on App Engine. Generally all I had to do was remove/modify/disable any parts of the library that accessed features forbidden by the sandbox (for instance, I had to disable the automatic caching to disk to get commons-fileupload to work on GAE). Not sure if the same would apply to Clojure, or even what the scope of a task like that would be.
I have been dabbling with Clojure and App Engine for a while now and I have to recommend appengine-magic. It abstracts most of the Java stuff away and is very easy to use. As a plus the project seems to be very active.
