We're going to develop a near-realtime, event-driven application (a backend and a bunch of mobile clients).
I think Akka (http://akka.io) is more than suitable for this. However, my colleague wants to use Google App Engine and its async features. I'm not convinced that's the best approach, and I wonder whether we can somehow meld the two. I can't find any solid, current information via Google.
The Channel API may be useful. However, the main limitation you may face on App Engine is transactional writes to the datastore, because an entity group (a parent entity and its children) can only sustain one to ten writes per second.
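To get a feel for what that write limit means for an event-driven backend, here is a rough back-of-the-envelope sketch. It assumes writes can be spread evenly over independent entity groups, and the function name and the default limit of 1 write per second are my own illustrative choices, not anything from the App Engine API.

```python
import math


def min_entity_groups(target_writes_per_sec: float, per_group_limit: float = 1.0) -> int:
    """Smallest number of independent entity groups needed to absorb a write rate.

    per_group_limit is the sustained writes/second one entity group can take
    (somewhere between 1 and 10 according to the figure quoted above).
    """
    if target_writes_per_sec <= 0:
        return 0
    return math.ceil(target_writes_per_sec / per_group_limit)


# Pessimistic and optimistic estimates for 50 events per second.
pessimistic = min_entity_groups(50, per_group_limit=1.0)
optimistic = min_entity_groups(50, per_group_limit=10.0)
```

If the number comes out large, that is a hint the data model should avoid funnelling all events through one parent entity.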
It is notable that the new Go support for App Engine allows actor-style programming with goroutines: when datastore or other operations block, other goroutines run. It would be nice if someone could do this for Scala and one of the actor variants. I presume the new backend system allows this style to be used in a long-running way.
Unrelatedly, on the issue of writing to an entity group: I write a record that might already be there (same key_name), and now I'm wondering whether I should read it first to check.
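The difference between the two options can be sketched with an in-memory stand-in. `FakeStore` below is purely hypothetical and only mimics keyed storage. On the real datastore the read-then-write variant would have to run inside a transaction to be safe, which is exactly the kind of write the entity-group limit applies to.

```python
class FakeStore:
    """In-memory stand-in for a keyed datastore (hypothetical, illustration only)."""

    def __init__(self):
        self._rows = {}

    def put(self, key_name, value):
        # Blind write: silently replaces whatever is stored under key_name.
        self._rows[key_name] = value

    def get(self, key_name):
        return self._rows.get(key_name)

    def get_or_insert(self, key_name, value):
        # Read first; only write when the key is absent.
        existing = self._rows.get(key_name)
        if existing is not None:
            return existing
        self._rows[key_name] = value
        return value


store = FakeStore()
store.put("user-1", "first")
store.put("user-1", "second")                       # overwrites "first"
kept = store.get_or_insert("user-1", "third")       # keeps "second"
created = store.get_or_insert("user-2", "fresh")    # inserts "fresh"
```

So if overwriting with identical data is harmless, the blind write saves a read. If the existing record must be preserved, the read (or a get-or-insert style call) is needed.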
Sorry if this is a naive question, but I've watched a bunch of talks from Google's staff and still don't understand why on earth I would use AE instead of CF.
If I understood it correctly, the whole concept of both of these services is to build a "microservice architecture".
Both CF and AE are stateless.
Both are supposed to execute for a limited period of time.
Both can interact with databases and other GCP APIs.
However, AE must be wrapped in its own server. Basically, it adds a lot of complexity on top of the same capabilities as CF. So, when should I use it instead of CF?
Cloud Functions (CFs) and Google App Engine (GAE) are different tools for different jobs. Using the right tool for the job is usually a good idea.
Driving a nail using pliers might be possible, but it won't be as convenient as using a hammer. Similarly building a complex app using CFs might be possible, but building it using GAE would definitely be more convenient.
CFs have several disadvantages compared to GAE (in the context of building more complex applications, of course):
they're limited to Node.js, Python, Go, Java, .NET Core, and Ruby. GAE supports several other popular programming languages
they're really designed for lightweight, standalone pieces of functionality; attempting to build complex applications out of such components quickly becomes "awkward". Yes, the inter-relationship context for every individual request must be restored on GAE just as well, but GAE offers more convenient means of doing that, which aren't available on CFs. For example, user session management, as discussed in other comments
GAE apps have an app context that survives across individual requests; CFs don't have that. Such context makes access to certain Google services more efficient/performant (or even plain possible) for GAE apps, but not for CFs. For example, memcache.
the availability of the app context for GAE apps can support more efficient/performant client libraries for other services which can't operate on CFs. For example accessing the datastore using the ndb client library (only available for standard env GAE python apps) can be more efficient/performant than using the generic datastore client library.
GAE can be more cost effective as it's "wholesale" priced (based on instance-hours, regardless of how many requests a particular instance serves) compared to "retail" pricing of CFs (where each invocation is charged separately)
response times are typically shorter for GAE apps than for CFs, since the app instance handling the request is usually already running, thus:
the GAE app context doesn't need to be loaded/restored, it's already available, CFs need to load/restore it
(most of the time) the handling code is already loaded; CFs' code still needs to be loaded. Not too sure about this one; I guess it depends on the underlying implementation.
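The "wholesale" versus "retail" pricing point in the list above can be made concrete with a break-even calculation. This is only a sketch: the prices are made-up placeholders rather than Google's actual rates, and it ignores the compute-time component that per-invocation billing usually also carries.

```python
def instance_hours_cost(instances, hours_per_month, price_per_instance_hour):
    """Flat ("wholesale") cost: paid per running instance, however many requests it serves."""
    return instances * hours_per_month * price_per_instance_hour


def per_invocation_cost(invocations, price_per_million_invocations):
    """Per-call ("retail") cost: grows linearly with the number of invocations."""
    return invocations / 1_000_000 * price_per_million_invocations


def break_even_invocations(instances, hours_per_month,
                           price_per_instance_hour, price_per_million_invocations):
    """Monthly invocation count at which both pricing models cost the same."""
    fixed = instance_hours_cost(instances, hours_per_month, price_per_instance_hour)
    return fixed / price_per_million_invocations * 1_000_000


# Hypothetical numbers: one instance running all month at 0.05 per hour,
# versus 0.40 per million invocations.
threshold = break_even_invocations(1, 720, 0.05, 0.40)
```

With these placeholder numbers the threshold lands at roughly 90 million invocations a month. Below that the per-invocation model is cheaper, above it the instance-hour model wins.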
App Engine is better suited to applications that have numerous pieces of functionality behaving in various inter-related (or even unrelated) ways, while Cloud Functions are more specifically single-purpose functions that respond to some event and perform some specific action.
App Engine offers numerous choices of language, and more management options, while cloud functions are limited in those areas.
You could easily replicate Cloud Functions on App Engine, but replicating a large-scale App Engine application using a bunch of discrete Cloud Functions would be complicated. For example, the backend of Spotify is App Engine based.
Another way to put this is that for a significantly large application, starting with a more complex system like App Engine can lead to a codebase which is less complex, or at least, easier to manage or understand.
Ultimately, both run on similar underlying infrastructure at Google, and it's up to you to decide which one works for the task at hand. Furthermore, there is nothing stopping you from mixing elements of both in a single project.
Google Cloud Functions are simple, single-purpose functions that are fired in response to the event(s) they watch.
These functions remove the need to build your own application servers to handle lightweight APIs.
Main use cases:
Data processing / ETL: Listen and respond to Cloud Storage events (e.g. a file created, changed, or removed)
Webhooks: Via a simple HTTP trigger, respond to events originating from 3rd-party systems like GitHub
Lightweight APIs : Compose applications from lightweight, loosely coupled bits of logic
Mobile backend: Listen and respond to events from Firebase Analytics, Realtime Database, Authentication, and Storage
IoT: Thousands of devices stream events, which in turn call Cloud Functions to transform and store the data
App Engine is meant for building highly scalable applications on a fully managed serverless platform. It helps you focus on code; infrastructure and security are provided by AE.
It supports many popular programming languages. You can bring any framework to App Engine by supplying a Docker container.
Use cases:
Modern web applications: Quickly reach customers with zero-config deployment and zero server management.
Scalable mobile backends: Seamless integration with Firebase provides an easy-to-use frontend mobile platform along with a scalable and reliable backend.
Refer to the official documentation pages of Cloud Functions and App Engine.
As both Cloud Functions and App Engine are serverless services, this is what I feel.
For microservices - We can go with either CFs or App Engine. I prefer CFs, though.
For monolithic apps - App Engine suits well.
The main differentiator, as @Cameron points out, is that Cloud Functions reliably respond to events. E.g. if you want to execute a script on a change in a Cloud Storage bucket, there is a dedicated trigger for Cloud Functions. Replicating this logic would be much more cumbersome in GAE. The same goes for Firestore collection changes.
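To show how little code such a trigger needs, here is a minimal sketch of a storage-triggered function. It assumes the first-generation Python background-function signature `(event, context)`, where `event` is a dict carrying the `bucket` and `name` of the changed object. The function name `on_bucket_change` is my own, and the actual trigger is wired up at deploy time, not in the code.

```python
def on_bucket_change(event, context):
    """Handle one Cloud Storage change notification.

    event:   dict describing the changed object (assumed keys: "bucket", "name").
    context: event metadata object (assumed attribute: event_type).
    """
    bucket = event.get("bucket", "<unknown bucket>")
    name = event.get("name", "<unknown object>")
    event_type = getattr(context, "event_type", "<unknown event>")

    # Real work (parsing the file, writing results somewhere) would go here.
    message = f"{event_type}: gs://{bucket}/{name}"
    print(message)
    return message
```

On GAE you would have to set up the notification plumbing and an HTTP endpoint yourself to get the same effect.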
Additionally, GAE's B-machines (backend machines for basic or manual scaling) have conveniently longer run times of up to 24 hours. Cloud Functions currently run for at most 9 minutes. Further, GAE allows you to define cron jobs as YAML files next to your application code. This makes developing a serverless, event-driven service much cleaner.
Of course, the other answers covered these aspects better than mine. But I wanted to point out the main advantages of Cloud Functions being the trigger options. If you want functions or services to communicate with each other, GAE is probably the better choice.
Please check the answer and comments of my previous question in order to get a better understanding of my situation. If I use Google Datastore on App Engine, my application will be tightly coupled and hence lose portability.
I'm working on Android and will be using a backend that will reside in the cloud. I need client-cloud communication. How do I build an application while maintaining portability? What design patterns and architectural patterns should I be using?
Should I use a broker pattern? I'm perplexed.
Google AppEngine provides JPA based interfaces for its datastore. As long as you are writing your code using JPA APIs, it will be easy to port the same to other datastores (Hibernate for example also implements JPA).
I would ensure that the vendor specific code doesn't percolate beyond a thin layer that sits just above the vendor's APIs. That would ensure that when I have to move to a different vendor, I know exactly which part of code would be impacted.
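A minimal sketch of such a thin layer follows. All names here (`MessageStore`, `InMemoryMessageStore`, `post_message`) are hypothetical. The idea is that a datastore-backed class would implement the same interface, and only that one class would need rewriting when switching vendors.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class MessageStore(ABC):
    """The only storage interface the application code is allowed to see."""

    @abstractmethod
    def save(self, message_id: str, body: str) -> None:
        ...

    @abstractmethod
    def load(self, message_id: str) -> Optional[str]:
        ...


class InMemoryMessageStore(MessageStore):
    """Stand-in backend; a vendor-specific implementation would sit beside it."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def save(self, message_id: str, body: str) -> None:
        self._rows[message_id] = body

    def load(self, message_id: str) -> Optional[str]:
        return self._rows.get(message_id)


def post_message(store: MessageStore, message_id: str, body: str) -> None:
    """Application logic: depends on the interface, never on a vendor API."""
    store.save(message_id, body.strip())


store = InMemoryMessageStore()
post_message(store, "m1", "  hello  ")
```

The same split works on the Android side: the client talks to your own HTTP API, so it never learns which storage vendor sits behind it.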
If you really want to avoid portability issues, use Google Cloud SQL instead. If you use the datastore, then unless it's a trivial structure you will not be able to port it trivially, even if you use pure JPA/JDO, because those were really not meant for NoSQL. Google's datastore has its own particularities with indexes, etc.
Of course, SQL is more expensive and has size limits.
In order to maintain portability for my application, I've chosen Restlet, which offers RESTful web APIs, over Endpoints. Restlet would help me communicate between server and client.
Moreover, it would not get my application locked in to a particular vendor.
I'm trying to build a chat application on GAE in Java. I need to keep count of all online users and their networks (chat rooms of some sort), and this info needs to be updated constantly.
I had (wrongly?) assumed that I could just use Java's ServletContext and its setAttribute/getAttribute methods to update online/offline users and share that information with all servlets. As I have come to learn (with lovely bugs), since GAE is a distributed cloud service, ServletContext.setAttribute doesn't work that way there: my app probably runs on more than one JVM, and information in the ServletContext is shared only between servlets in the same JVM.
This is a huge problem for me, of course.
Several questions -
1) Is it true that ServletContext won't work properly on GAE?
2) Is GAE a bad choice for beginner web developers like myself? It seems to me that I always find new problems and things that don't correspond with Servlet/JSP rules. Since it is hard enough for a beginner to learn servlets, maybe GAE isn't the right choice?
3) How then can I share information between servlets?
If you're really just trying to learn Java EE for your own purposes I would probably avoid GAE for the reasons you mention. It's a perfectly good service, but yes, it has its own set of caveats that might get in the way of your learning. You might be better off just spinning up an EC2 instance for your purposes.
That said - you are correct, AppEngine will spin up and down instances to serve requests. If you want shared state you should use memcache which is shared across instances, but you have to manage access to the memcache objects for the possibility of multiple users writing to it at the same time.
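"Managing access" usually means a compare-and-set retry loop: read the value together with a version, compute the update, and write only if nobody else wrote in between. Below is a rough sketch of that pattern. `FakeSharedCache` is a hypothetical in-memory stand-in, not the real memcache client, and its `gets`/`cas` methods with explicit version numbers are my own simplification.

```python
import threading


class FakeSharedCache:
    """In-memory stand-in for a cache shared by several instances (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (value, version)

    def gets(self, key):
        """Return (value, version); a missing key reads as (None, 0)."""
        with self._lock:
            return self._data.get(key, (None, 0))

    def cas(self, key, value, version):
        """Store value only if the key is still at the given version."""
        with self._lock:
            _, current = self._data.get(key, (None, 0))
            if current != version:
                return False  # someone else wrote first; caller must retry
            self._data[key] = (value, current + 1)
            return True


def add_online_user(cache, room, user, max_retries=10):
    """Add a user to a room's online set without losing concurrent updates."""
    for _ in range(max_retries):
        users, version = cache.gets(room)
        updated = frozenset(users or ()) | {user}
        if cache.cas(room, updated, version):
            return True
    return False  # gave up after too much contention


cache = FakeSharedCache()
add_online_user(cache, "lobby", "alice")
add_online_user(cache, "lobby", "bob")
```

Keep in mind that memcache entries can be evicted at any time, so anything you can't afford to lose still belongs in the datastore.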
In Google App Engine, application state is usually shared between instances using the datastore. As your requirement is more real-time and might not behave nicely with polling, you should use the Channel API (perhaps in addition to the datastore):
https://developers.google.com/appengine/docs/java/channel/
Quoting from that page:
The Channel API creates a persistent connection between your
application and Google servers, allowing your application to send
messages to JavaScript clients in real time without the use of
polling. This is useful for applications designed to update users
about new information immediately.
We are starting a new project that requires two main components:
Backend for task management, e.g. retrieve a task from a queue and validate it according to some specific logic.
Run a real compiler on that specific task and create an executable that an end user should receive.
We love App Engine; however, the second part requires a concrete instance where an actual compiler has to be installed, and App Engine can't do that. We were thinking of mixing App Engine and AWS instances to accomplish the task (part 1 on App Engine and part 2 on AWS).
All of our senses say it's a bad idea:
Unneeded traffic between the two providers; someone needs to pay for that, unfortunately.
We'll have to deal with two systems and two deployment processes, and each system has its own quirks --> double the work.
But we love App Engine.
Does anyone have any experience in combining the two systems? Any recommendations?
There's no reason why what you suggest won't work, especially if you separate your concerns well, by exposing a clean 'compiler' interface on AWS or a similar service. Yes, you will have to pay for traffic between the two services, but this is unlikely to be substantial. If you are serving up the end result to the user, you can link them directly to AWS, rather than fetching it with your app first.
AWS's EC2 instances are literally just vanilla Linux boxes in the sky. I would also throw out the suggestion of just moving to it completely. Porting your system over may be easier than it sounds if you're Unix savvy.
I am trying to scrape a website and republish the data as an RSS feed. How hard is this to set up with Google App Engine? What are the advantages and disadvantages of using GAE? Any recommendations and guidelines are greatly appreciated!
Google AppEngine offers much more functionality (and complexity) than you will need if truly all you will want to do is republish some structured data as RSS.
Personally, I would use something like Yahoo Pipes for a task like this.
That being said... if you want/need to get your feet wet with GAE, go for it!
Working with Google App Engine is pretty straightforward. I would recommend going through the Getting Started guide. It's short and simple and touches on essential GAE topics. There are more pros and cons than I will list here.
Pros:
In general, App Engine is designed for high-traffic web applications that need to scale. Furthermore, it is designed from a programmer's perspective. Many of the scalability issues (database optimization, server administration, etc.) are dealt with by Google. Having said that, I find it to be a nice platform. It is still being actively developed by Google engineers, and scheduling of tasks (a long-requested feature) is on the current road map.
Cons:
Perhaps the biggest downside right now is, again, the lack of official scheduling support, along with the quota limits currently set for free accounts. However, you can't complain much if it's free. Currently it supports only Python as a programming interface (although a new language [Java, I predict] is coming soon). Furthermore, Python 2.6 (and 3.0, for that matter) is not yet supported. In addition, Django 1.0 is not officially supported in App Engine (although you can package Django 1.0 with your application).
Harder than it would be in most other technologies.
GAE can sort of do scheduled batch stuff like this now, but it's really not intended for that type of thing. Pick pretty much any other language and platform for this particular task, and you'll make your life a lot easier.
I think BeautifulSoup could run on GAE, so all your scraping needs are handled :D
Also, GAE has a URL fetch service (urlfetch). The only problem I think you might have is not having enough time to get the data (30-second limit).
I am working on a similar project, and I've decided that it's easier to prepare the data on another server and push it to GAE.
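Whichever server ends up doing the work, the scrape-then-republish step itself is small. Here is a rough sketch that uses only the Python standard library (`html.parser` standing in for BeautifulSoup) and turns every link on a page into an RSS 2.0 item. The names `LinkCollector` and `html_to_rss` are mine, and fetching the page is left out on purpose.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a href=...> in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def html_to_rss(html, feed_title, feed_link):
    """Return an RSS 2.0 document (as a string) with one item per scraped link."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Scraped from " + feed_link
    for title, href in collector.links:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = href
    return ET.tostring(rss, encoding="unicode")


page = '<ul><li><a href="http://example.com/1">First</a></li><li><a href="http://example.com/2">Second</a></li></ul>'
feed = html_to_rss(page, "Demo feed", "http://example.com")
```

A real scraper would pick out specific elements instead of every link, but the shape stays the same: parse, collect, serialize.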
You might also want to look into Yahoo! Query Language (YQL)