Google App Engine Global Variables (ServletContext)

I'm trying to build a chat application on GAE in Java. I need to keep count of all online users and their networks (chat rooms of some sort), and this info needs to be updated constantly.
I have (wrongly?) assumed that I can just use Java's ServletContext and its setAttribute/getAttribute methods to track online/offline users and share that information with all servlets. As I have come to learn (through some lovely bugs), since GAE is a distributed cloud service, ServletContext.setAttribute does not behave as a global store: my app probably runs on more than one JVM, and information in the ServletContext is shared only between servlets running in the same JVM.
This is a huge problem for me, of course.
Several questions:
1) Is it true that ServletContext won't work properly on GAE?
2) Is GAE a bad choice for beginner web developers like myself? I keep finding new problems and things that don't follow the usual Servlet/JSP rules. Since it is hard enough for a beginner to learn Servlets, maybe GAE isn't the right choice?
3) How, then, can I share information between servlets?

If you're really just trying to learn Java EE for your own purposes I would probably avoid GAE for the reasons you mention. It's a perfectly good service, but yes, it has its own set of caveats that might get in the way of your learning. You might be better off just spinning up an EC2 instance for your purposes.
That said, you are correct: App Engine will spin instances up and down to serve requests. If you want shared state you should use memcache, which is shared across instances, but you have to manage access to the memcache objects because multiple users may write to them at the same time.
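The concurrent-write problem mentioned above is usually handled with a compare-and-set retry loop. The following is a minimal, language-neutral sketch written in Python; `FakeMemcache`, `add_online_user`, and the `room-1` key are hypothetical names invented for illustration, and the class is an in-memory stand-in, not the App Engine API. On App Engine the equivalent calls are, to my knowledge, `getIdentifiable` / `putIfUntouched` on the Java `MemcacheService`, with their own signatures.

```python
import threading


class FakeMemcache:
    """In-memory stand-in for a shared cache with compare-and-set (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (value, version)

    def gets(self, key):
        """Return (value, version), or (None, 0) if the key is absent."""
        with self._lock:
            return self._data.get(key, (None, 0))

    def cas(self, key, new_value, expected_version):
        """Store new_value only if nobody has written since expected_version was read."""
        with self._lock:
            _, current_version = self._data.get(key, (None, 0))
            if current_version != expected_version:
                return False  # another writer got there first
            self._data[key] = (new_value, current_version + 1)
            return True


def add_online_user(cache, room, user, max_retries=50):
    """Add user to the room's online set, retrying whenever another writer wins."""
    for _ in range(max_retries):
        members, version = cache.gets(room)
        updated = frozenset(members or ()) | {user}
        if cache.cas(room, updated, version):
            return True
    return False


# Eight "requests" update the same room concurrently; no update is lost.
cache = FakeMemcache()
workers = [
    threading.Thread(target=add_online_user, args=(cache, "room-1", f"user-{i}"))
    for i in range(8)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

The point of the loop is that a failed `cas` re-reads the current value before trying again, so a plain read-modify-write race cannot silently drop a user.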

In Google App Engine, application state is usually shared between instances using the datastore. As your requirement is closer to real time and might not behave nicely with polling, you should use the Channel API (perhaps in addition to the datastore):
https://developers.google.com/appengine/docs/java/channel/
Quoting from that page:
The Channel API creates a persistent connection between your
application and Google servers, allowing your application to send
messages to JavaScript clients in real time without the use of
polling. This is useful for applications designed to update users
about new information immediately.

Related

When to choose App Engine over Cloud Functions?

Sorry if this is a naive question, but I've watched a bunch of talks from Google's staff and still don't understand why on earth I would use AE instead of CF.
If I understood it correctly, the whole concept of both of these services is to build a "microservice architecture".
Both CF and AE are stateless.
Both are supposed to execute for a limited period of time.
Both can interact with databases and other GCP APIs.
AE, though, must be wrapped in its own server. Basically it adds a lot of complexity on top of the same capabilities as CF. So, when should I use it instead of CF?
Cloud Functions (CFs) and Google App Engine (GAE) are different tools for different jobs. Using the right tool for the job is usually a good idea.
Driving a nail using pliers might be possible, but it won't be as convenient as using a hammer. Similarly building a complex app using CFs might be possible, but building it using GAE would definitely be more convenient.
CFs have several disadvantages compared to GAE (in the context of building more complex applications, of course):
they're limited to Node.js, Python, Go, Java, .NET Core, and Ruby. GAE supports several other popular programming languages
they're really designed for lightweight, standalone pieces of functionality, attempting to build complex applications using such components quickly becomes "awkward". Yes, the inter-relationship context for every individual request must be restored on GAE just as well, only GAE benefits from more convenient means of doing that which aren't available on CFs. For example user session management, as discussed in other comments
GAE apps have an app context that survives across individual requests, CFs don't have that. Such context makes access to certain Google services more efficient/performant (or even plain possible) for GAE apps, but not for CFs. For example memcached.
the availability of the app context for GAE apps can support more efficient/performant client libraries for other services which can't operate on CFs. For example accessing the datastore using the ndb client library (only available for standard env GAE python apps) can be more efficient/performant than using the generic datastore client library.
GAE can be more cost effective as it's "wholesale" priced (based on instance-hours, regardless of how many requests a particular instance serves) compared to "retail" pricing of CFs (where each invocation is charged separately)
response times might be typically shorter for GAE apps than CFs since typically the app instance handling the request is already running, thus:
the GAE app context doesn't need to be loaded/restored, it's already available, CFs need to load/restore it
(most of the time) the handling code is already loaded; CFs' code still needs to be loaded. Not too sure about this one; I guess it depends on the underlying implementation.
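The "wholesale" versus "retail" pricing point in the list above can be made concrete with a break-even calculation. This is a rough sketch only: the function names and every price below are hypothetical placeholders, not actual Google Cloud rates, and it deliberately ignores anything billed beyond instance-hours and per-invocation charges.

```python
def monthly_cost_instance(instance_hours, price_per_instance_hour):
    """Cost when billed per instance-hour, regardless of request count."""
    return instance_hours * price_per_instance_hour


def monthly_cost_per_invocation(invocations, price_per_million):
    """Cost when every invocation is billed separately."""
    return invocations / 1_000_000 * price_per_million


def break_even_invocations(instance_hours, price_per_instance_hour, price_per_million):
    """Invocations per month above which instance-hour pricing becomes cheaper."""
    instance_cost = monthly_cost_instance(instance_hours, price_per_instance_hour)
    return instance_cost / price_per_million * 1_000_000


# Hypothetical numbers: one instance running all month (720 h) at 0.05 per hour,
# versus 0.40 per million invocations.
threshold = break_even_invocations(720, 0.05, 0.40)
```

Below the threshold the per-invocation model is cheaper; above it, a continuously running instance wins, which is the sense in which instance-hour pricing favors busy applications.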
App Engine is better suited to applications that have numerous pieces of functionality behaving in various inter-related (or even unrelated) ways, while Cloud Functions are more specifically single-purpose functions that respond to some event and perform some specific action.
App Engine offers numerous choices of language, and more management options, while cloud functions are limited in those areas.
You could easily replicate Cloud Functions on App Engine, but replicating a large-scale App Engine application using a bunch of discrete Cloud Functions would be complicated. For example, the backend of Spotify is App Engine based.
Another way to put this is that for a significantly large application, starting with a more complex system like App Engine can lead to a codebase which is less complex, or at least, easier to manage or understand.
Ultimately these both run on similar underlying infrastructure at Google, and it's up to you to decide which one works for the task at hand. Furthermore, there is nothing stopping you from mixing elements of both in a single project.
Google Cloud Functions are simple, single-purpose functions that are fired in response to the event(s) they watch.
These functions remove the need to build your own application servers to handle lightweight APIs.
Main use cases :
Data processing / ETL: Listen and respond to Cloud Storage events (e.g. a file created, changed, or removed)
Webhooks: Via a simple HTTP trigger, respond to events originating from 3rd party systems like GitHub
Lightweight APIs: Compose applications from lightweight, loosely coupled bits of logic
Mobile backend: Listen and respond to events from Firebase Analytics, Realtime Database, Authentication, and Storage
IoT: Thousands of devices streaming events, which in turn call Google Cloud Functions to transform and store data
App Engine is meant for building highly scalable applications on a fully managed serverless platform. It helps you focus more on code; infrastructure and security are provided by AE.
It supports many popular programming languages. You can bring any framework to App Engine by supplying a Docker container.
Use cases:
Modern web application to quickly reach customers with zero config deployment and zero server management.
Scalable mobile backends : Seamless integration with Firebase provides an easy-to-use frontend mobile platform along with the scalable and reliable back end.
Refer to the official documentation pages of Cloud Functions and App Engine.
As both Cloud Functions and App Engine are serverless services, this is what I feel:
For microservices: we can go with either CFs or App Engine. I prefer CFs though.
For monolithic apps: App Engine suits well.
The main differentiator, as @Cameron points out, is that Cloud Functions reliably respond to events. E.g. if you want to execute a script on a change in a Cloud Storage bucket, there is a dedicated trigger for Cloud Functions. Replicating this logic would be much more cumbersome in GAE. The same goes for Firestore collection changes.
Additionally, GAE's B-machines (backend machines for basic or manual scaling) have conveniently longer run times of up to 24 hours. Cloud Functions currently run for at most 9 minutes. Further, GAE allows you to define cron jobs as YAML files next to your application code. This makes developing a serverless event-driven service much cleaner.
Of course, the other answers covered these aspects better than mine. But I wanted to point out the main advantages of Cloud Functions being the trigger options. If you want functions or services to communicate with each other, GAE is probably the better choice.

share datastore between GAE (ndb) and GCE (gcloud.datastore) in test environment

I have a Python App Engine app using the datastore with the ndb API, and I want to do background work and store the results in the datastore so that App Engine can use them.
I wanted to use GCE or my own computer to do so, but the ndb API is not available outside App Engine, and the alternative seems to be the gcloud.datastore API, which is very different.
How do you guarantee that what you push (with the gcloud API) is consistent with what you get (i.e. matches an ndb entity)?
I can't write unit tests because the local server is not the same (gcd vs dev_appserver). Here is a workaround (but in Java).
Should I replace the ndb code with gcloud.datastore in App Engine to ensure consistency (but losing ndb advantages like automatic caching)?
Is there an obvious solution I'm missing ? If someone had the same issue, how did you handle it ?
Thanks
If you're really worried about consistency and you're using fancy features of ndb, you should probably look into using the Remote API for App Engine, which effectively lets you run arbitrary code via a remote (HTTP) interface. It might help you get the job done, but remember that the CPU cycles you're using will be in GAE, not in GCE.
If you're willing to wait a while, we're working on porting the ndb API to run against the Cloud Datastore API which would mean the same code you run in App Engine will work outside of App Engine (on your local machine or inside Google Compute Engine).
The gcloud.datastore (gcloud-python) API is much more low-level, so you should have even more control over the data that ends up in the Datastore. It wasn't built to be identical to ndb (and therefore doesn't have some of the fancy stuff like derived fields or geo-points as first-class citizens); however, ndb stores those fields using its own Python logic, so you should be able to write safely if you're comfortable with the lower-level data representations.
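One way to keep the lower-level writes in line with what the ndb model expects is to describe the expected properties once and check every outgoing property dict against that description before writing it. The sketch below uses plain Python only and makes no calls to ndb or gcloud.datastore; `RESULT_SCHEMA`, `validate_entity`, and the property names are hypothetical and would need to mirror your actual model.

```python
from datetime import datetime

# Hypothetical description of the properties an ndb model named "Result" expects.
RESULT_SCHEMA = {
    "job_id": str,
    "score": float,
    "finished_at": datetime,
}


def validate_entity(properties, schema):
    """Return a list of problems; an empty list means the dict matches the schema."""
    problems = []
    for name, expected_type in schema.items():
        if name not in properties:
            problems.append(f"missing property: {name}")
        elif not isinstance(properties[name], expected_type):
            actual = type(properties[name]).__name__
            problems.append(f"{name}: expected {expected_type.__name__}, got {actual}")
    for name in properties:
        if name not in schema:
            problems.append(f"unexpected property: {name}")
    return problems


# A background worker would run this check before handing the dict to the
# low-level datastore client.
good = {"job_id": "job-42", "score": 0.93, "finished_at": datetime(2015, 1, 1)}
bad = {"job_id": "job-43", "score": 3}
```

This only catches shape and type mismatches. It does not reproduce ndb-specific encodings, so properties that ndb serializes with its own logic still need to be checked by reading the entity back through ndb.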

Efficient web services using AppEngine

I'm trying to use App Engine as a sort of RESTful web service. The service is supposed to do simple finds and puts from the Datastore, so Objectify seems good for covering that part. It also does a few lookups to other services if data is not available in the Datastore. I'm using Redstone XMLRPC for that part.
Now, I have a couple of questions about how to design the serving part in view of App Engine's quotas (I guess one should think about efficiency in most cases, but App Engine's billing makes more people think about efficiency).
First, let's consider that I use simple servlets. In this case, I see two options. Either I create a number of servlets, each providing a different service with JSON passed to each of them, or I use a single servlet (or a smaller number of them) and determine the action to perform based on a parameter passed with the JSON. Will either design have any significant effect on the number of hours, etc. clocked by App Engine?
What is the cost difference if I use a RESTful framework such as Restlet or RestEasy as opposed to the barebones approach ?
This question is something of a follow up to : Creating Java Web Service using Google AppEngine
It's not so important, because most of the cost goes to the datastore, so frontend micro-optimisation doesn't matter.
You may save a few cents by choosing the 'simple servlet' approach, but is that your goal? It's much more important to design good data structures, prepare all required data in the background, have a good caching strategy, etc.
I agree with @Igor.
However, there is an additional thing to consider: http sessions.
GAE supports http sessions. Since GAE is a distributed system, sessions are stored in Datastore (and cached in Memcache for efficient reading). Session is updated in every request (to support expiration), so on every request Datastore is accessed.
Sessions are not required for REST and should be turned off.

Mixing aws and app engine

We are starting a new project that requires two main components:
Backend for task management, e.g. retrieve a task from a queue and validate it according to some specific logic.
Run a real compiler on that specific task and create an executable that an end user should receive.
We love App Engine; however, the second part will require a concrete instance where an actual compiler has to be installed, and App Engine is not capable of that. We were thinking of mixing App Engine and AWS instances to accomplish the task (part 1 on App Engine and part 2 on AWS).
All of our senses say it's a bad idea:
Unneeded traffic between the two providers; someone needs to pay for that, unfortunately.
We'll have to deal with two systems and two deployment processes, and each system has its own quirks --> double the work.
But we love App Engine.
Does anyone have any experience in combining the two systems? Any recommendations?
There's no reason why what you suggest won't work, especially if you separate your concerns well, by exposing a clean 'compiler' interface on AWS or a similar service. Yes, you will have to pay for traffic between the two services, but this is unlikely to be substantial. If you are serving up the end result to the user, you can link them directly to AWS, rather than fetching it with your app first.
AWS's EC2 instances are literally just vanilla Linux boxes in the sky. I would also throw out the suggestion of just moving to it completely. Porting your system over may be easier than it sounds if you're Unix savvy.

Is Google App Engine suitable for near-realtime event-driven application?

We're going to develop a near-realtime event-driven application (backend and bunch of mobile clients).
I think Akka (http://akka.io) is more than suitable for this. However, my colleague wants to use Google App Engine and its async features. I'm not convinced it's the best approach to take, and I wonder if we can somehow meld those two things together. I can't find any solid, up-to-date info via Google.
The Channel API may be useful. However, the main limitation you may face using App Engine is transactional writes to the datastore, because an entity group (a parent entity and its children) can only support one to ten writes per second.
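To get a feel for what that per-entity-group limit means for an event-driven backend, here is a small back-of-the-envelope sketch. It only does the arithmetic implied by the answer above; the function name and the example rates are hypothetical, and it assumes writes can be spread evenly over independent entity groups.

```python
import math


def entity_groups_needed(target_writes_per_sec, writes_per_sec_per_group=1.0):
    """Minimum number of independent entity groups needed to absorb a write rate."""
    if target_writes_per_sec <= 0 or writes_per_sec_per_group <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(target_writes_per_sec / writes_per_sec_per_group)


# Pessimistic end of the quoted range: 1 write per second per group.
pessimistic = entity_groups_needed(50, 1.0)
# Optimistic end of the quoted range: 10 writes per second per group.
optimistic = entity_groups_needed(50, 10.0)
```

In other words, a design that funnels all events through one parent entity caps out at a handful of writes per second, while a design where each chat room or device has its own entity group scales with the number of groups.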
It is notable that the new Go support for App Engine supports actor-style programming with goroutines. When datastore or other operations block, other goroutines run. It would be nice if someone could do this for Scala and one of the actor variants. I presume the new backend system allows this style to be used in a long-running way.
Unrelatedly, on the issue of writing to an entity group. I write a record that might be already there (same key_name), and now I'm wondering if I should read it first to check.

Resources