Efficient web services using AppEngine - google-app-engine

I'm trying to use AppEngine as a sort of RESTful web service. The service is supposed to do simple finds and puts from the Datastore, so Objectify seems good for covering that part. It also does a few lookups to other services if data is not available in the Datastore. I'm using Redstone XMLRPC for that part.
Now, I have a couple of questions about how to design the serving part in view of AppEngine's quotas (I guess one should think about efficiency in most cases, but AppEngine's billing makes more people think about efficiency).
First, let's consider that I use simple servlets. In this case, I see two options. Either I create a number of servlets, each providing a different service with JSON passed to each of them, or I use a single servlet (or a smaller number of them) and determine the action to perform based on a parameter passed with the JSON. Will either design have any significant effect on the number of instance hours, etc. clocked by AppEngine?
What is the cost difference if I use a RESTful framework such as Restlet or RESTEasy as opposed to the bare-bones approach?
This question is something of a follow-up to: Creating Java Web Service using Google AppEngine

It's not so important, because most of the cost goes to the datastore, so frontend micro-optimisation doesn't matter.
You may save a few cents by choosing a 'simple servlet', but is that your goal? It's much more important to design good data structures, prepare all required data in the background, have a good caching strategy, etc.

I agree with @Igor.
However, there is an additional thing to consider: http sessions.
GAE supports HTTP sessions. Since GAE is a distributed system, sessions are stored in the Datastore (and cached in Memcache for efficient reading). The session is updated on every request (to support expiration), so the Datastore is accessed on every request.
Sessions are not required for REST and should be turned off.

Related

Google App Engine Global Variables(ServletContext)

I'm trying to build a chat application on GAE in Java. I need to keep count of all online users and their networks (chat rooms of some sort), and this info needs to be updated constantly.
I have (wrongly?) assumed that I can just use Java's ServletContext and its setAttribute/getAttribute methods to update online/offline users and share that information with all servlets. As I have come to learn (through some lovely bugs), since GAE is a distributed cloud service, it doesn't effectively implement ServletContext.setAttribute: my app probably runs on more than one JVM, and information in the ServletContext is shared only between servlets belonging to the same JVM.
This is a huge problem for me, of course.
Several questions:
1) Is it true that ServletContext won't work properly on GAE?
2) Is GAE a bad choice for beginner web developers like myself? It seems to me that I always find new problems and things that don't correspond with the Servlet/JSP rules. Since it is hard enough for a beginner to learn servlets, maybe GAE isn't the right choice?
3) How, then, can I share information between servlets?
If you're really just trying to learn Java EE for your own purposes I would probably avoid GAE for the reasons you mention. It's a perfectly good service, but yes, it has its own set of caveats that might get in the way of your learning. You might be better off just spinning up an EC2 instance for your purposes.
That said - you are correct, AppEngine will spin instances up and down to serve requests. If you want shared state you should use memcache, which is shared across instances, but you have to manage access to the memcache objects to allow for multiple users writing to them at the same time.
In Google App Engine, application state is usually shared between instances using the datastore. As your requirement is more real-time and might not behave nicely with polling, you should use the Channel API (perhaps in addition to the datastore):
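To illustrate the point about managing concurrent writers, here is a minimal sketch of an optimistic compare-and-set update with retry. Everything in it is an assumption for illustration: OnlineUserCounter is a hypothetical class, and the in-process ConcurrentHashMap is only a stand-in for the shared cache (on App Engine the shared store would be memcache, whose MemcacheService has, as far as I know, getIdentifiable and putIfUntouched for the same purpose). Only the retry pattern is the point.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical counter of online users per chat room.
// The map below is a local stand-in for a shared cache such as memcache.
final class OnlineUserCounter {
    private final ConcurrentMap<String, Integer> cache = new ConcurrentHashMap<>();

    // Adds delta to the room's counter using compare-and-set with retry,
    // so two concurrent writers never silently overwrite each other.
    int add(String room, int delta) {
        while (true) {
            Integer current = cache.get(room);
            if (current == null) {
                // First writer for this room: only succeeds if nobody beat us to it.
                if (cache.putIfAbsent(room, delta) == null) {
                    return delta;
                }
            } else {
                int updated = current + delta;
                // Only succeeds if the stored value is still the one we read.
                if (cache.replace(room, current, updated)) {
                    return updated;
                }
            }
            // Another writer changed the value in between: read again and retry.
        }
    }

    // Returns the current count, or 0 if the room has no entry.
    int get(String room) {
        Integer value = cache.get(room);
        return value == null ? 0 : value;
    }
}
```

Because a cache entry can be evicted at any time, a count kept this way should be treated as approximate unless it is also persisted somewhere durable.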
https://developers.google.com/appengine/docs/java/channel/
Quoting from that page:
The Channel API creates a persistent connection between your
application and Google servers, allowing your application to send
messages to JavaScript clients in real time without the use of
polling. This is useful for applications designed to update users
about new information immediately.

Sharing memory-based data in Google App Engine

I'm loosely considering using Google App Engine for some Java server hosting, however I've come across what seems to be a bit of a problem whilst reading some of the docs. Most servers that I've ever written, and certainly the one I have in mind, require some form of memory-based storage that persists between sessions, however GAE seems to provide no mechanism for this.
Data can be stored as static objects but an app may use multiple servers and the data cannot be shared between the servers.
There is memcache, which is shared, but since this is a cache it is not reliable.
This leaves only the datastore, which would work perfectly, but is far too slow.
What I actually need is a high-performance (i.e. memory-based) store that is accessible to, and consistent for, all client requests. In this case it is to provide a specialized locking and synchronization mechanism that sits in front of the datastore.
It seems to me that there is a big gap in the functionality here. Or maybe I am missing something?
Any ideas or suggestions?
Static data (data you upload along with your app) is visible, read-only, to all instances.
To share data between instances, use the datastore. Where low latency is important, cache in memcache. Those are the basic options. Reading out of the datastore is pretty fast; it's only writes you'll need to concern yourself with, and their cost can be mitigated by making sure that any entity properties you don't need to query against are unindexed.
Another option, if it fits your budget, is to run your own cache in an always-on backend server.

Google App Engine Servlet Design

I have built a server on GAE that handles 6 different types of requests over HTTP POST, all of which involve either creating, updating, or deleting objects from the datastore. What is the best design for this? I will describe my current design and outline a couple of others.
1. My current design has all requests sent to the same servlet, and uses an "action" parameter as part of the POST to distinguish and handle the different requests. The code the server should run for each action is included right there, e.g.
public void doPost(HttpServletRequest request, HttpServletResponse response) {
    if (request.getParameter("action").equals("action_1")) {..code..}
    if (request.getParameter("action").equals("action_2")) {..code..}
    .
    .
    .
    if (request.getParameter("action").equals("action_n")) {..code..}
}
2. Similar to the above, but instead of holding the code itself, this servlet just acts as a centralized servlet and calls a dedicated servlet for each action.
3. Have just a dedicated servlet for each action.
What are the pros and cons to the above designs and what is the preferred way to setup a server on GAE? Does accessing the datastore have an impact on my design?
I am in a similar situation. I started out with your option 1, which works fine. The only problem is it requires a lot of argument parsing, converting strings to integers and whatnot, as well as a manual mapping of command names to methods. Options 2 and 3 are equally laborious, but even worse because you have to create a bunch of auxiliary methods. If I had to do it all over again I would use a library that does all that work for me, like this one (I am in fact considering converting to this): http://code.google.com/p/json-rpc/. Voila, no argument parsing or manual creation of helper classes! This one happens to implement a json rpc client-server interface which is good if you are doing an ajax "thick client." If you are generating most of your HTML on the server side you might want another solution.
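As a rough illustration of the "manual mapping of command names to methods" mentioned above, here is a minimal sketch of a lookup-table dispatcher that replaces the chain of if statements. ActionDispatcher and its methods are hypothetical names, and plain maps stand in for the servlet request and response so the sketch runs without a servlet container; it is not how any particular library does it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical dispatcher: maps the value of the "action" parameter to a handler.
// In a real servlet, doPost would collect the request parameters into a map,
// call dispatch, and write the returned string to the response.
final class ActionDispatcher {
    private final Map<String, Function<Map<String, String>, String>> handlers =
            new LinkedHashMap<>();

    // Registers the handler that should run for the given action name.
    void register(String action, Function<Map<String, String>, String> handler) {
        handlers.put(action, handler);
    }

    // Looks up the handler for params.get("action") and runs it.
    // A missing or unknown action yields an error string instead of an exception.
    String dispatch(Map<String, String> params) {
        String action = params.get("action");
        if (action == null) {
            return "error: missing action";
        }
        Function<Map<String, String>, String> handler = handlers.get(action);
        if (handler == null) {
            return "error: unknown action " + action;
        }
        return handler.apply(params);
    }
}
```

Adding a seventh request type then means one more register call rather than another branch, and a request without an "action" parameter no longer causes a NullPointerException.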
I have built a server on GAE that handles 6 different types of requests over
HTTP POST, all of which involve either creating, updating, or deleting objects
from the datastore. What is the best design for this?
It sounds like a job for a web service. My favorite is REST (though with REST, actions are usually mapped to URLs, not parameters). Take a look at the RESTEasy docs.
Since they all do separate things, use separate servlets. There's no point combining them into a single servlet: It makes both your code and the URL mapping messier.
Too many servlets can lead to slow class loading (cold startups) in the GAE environment, but too few can lead to request contention and therefore poor performance due to high latency. So there is a trade-off.
A workaround worth considering is to enable the "always on" and "warm-up request" features, and to make your servlets thread-safe.

Can five different GAE sites all share a common datastore?

In addition to the datastore for your specific site, can you also share one datastore between all your websites? (Like connecting to a different MySQL database from your main MySQL database?)
Not really.
Two workarounds:
Use five "versions" of the same app instead of five different apps. They would share the same data store. The sites they power need not look alike at all (except sharing the domain).
Make the data store web-accessible by enabling the remote_api. It is up to you to configure the security for this, and performance is not likely to be great. Also, at the moment, the client-side remote_api is only available for Python (the server-side works on Java, too, though).
Short answer: no, your application has one and only one datastore, and it is completely segregated from every other application's datastore.
Longer answer: if you had an external datastore of some variety that was web-accessible, you could access it using urlfetch, but there is no way to access more than one AppEngine datastore using the datastore API.
RESTful calls between the apps could be expensive. An alternative is to run one multitenant app for many client domains and use namespaces to partition your data.

Google Web Toolkit (GWT) + Google App Engine (GAE) + Detached Data Persistence

I would like to develop a web-app requiring data persistence using GWT and GAE. As I understand it, my only (or at least by far the most convenient) option for data persistence is GAE's Datastore, using JDO or JPA annotated objects. I would also like to be able to send my objects back and forth client-server using GWT Remote Procedure Calls (RPC), therefore my objects must be able to "detach". However, GWT RPC serialization cannot handle detached JDO/JPA objects and it doesn't appear as though it will in the near future.
My question: what is the simplest and most direct solution to this? Being able to share the same objects client/server with server-side persistence would be extremely convenient.
EDIT
I should clarify that I still wish to use GWT RPC with GAE's Datastore. I am just looking for the best solution that would allow all these technologies to work together.
Try using http://gilead.sourceforge.net/
I've recently found Objectify, which is designed to be a replacement for JDO. I don't have much experience with it yet, but it's simpler to use than JDO, seems more lightweight, and claims to get around the need for DTOs with GWT, though I haven't tried that particular feature yet.
Ray Cromwell has a temporary hack up. I've tried it, and it works.
It forces you to use Transient instead of Detachable entities, because GWT can't serialize a hidden Object[] used by DataNucleus. This means that the objects you send to the client can't be inserted back into the datastore; you must retrieve the actual datastore object and copy all the persistent fields back into it. Ray's method uses reflection to iterate over the methods, retrieve the getBean() and setBean() methods, and apply the entity's setBean() with your transient GWT object's getBean().
You should strive to use JDO, the JPA isn't much more than a wrapper class for now. To use this hack, you must have both getter and setter methods for every persistent field, using PROPER getBean and setBean syntax for every "bean" field. Well, ALMOST PROPER, as it assumes all getters will start with "get", when the default boolean field use is "is".
I've fixed this issue and posted a comment on Ray's blog, but it's awaiting approval and I'm not sure if he'll post it. Basically, I implemented a @GetterPrefix(prefix=MethodPrefix.IS) annotation in the org.datanucleus package to augment his work.
In case it doesn't get posted, and this is an issue, email x_AT_aiyx_DOT_info Re: @GetterPrefix for JDO and I'll send you the fix.
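For readers who want to see the general idea, here is a minimal sketch of a reflection-based property copy that accepts both "get" and "is" getters. It is not Ray's code and not the annotation-based fix described above; BeanCopier and ChatUser are hypothetical names, and the rule "treat isX as a getter only when it returns boolean" is my own assumption.

```java
import java.lang.reflect.Method;

// Hypothetical helper: copies bean properties from a transient object
// onto a freshly loaded persistent object of a compatible type.
final class BeanCopier {
    // For every public getter on source (getX, or isX returning boolean),
    // calls the matching public setX on target if one exists.
    static void copy(Object source, Object target) {
        for (Method getter : source.getClass().getMethods()) {
            String property = propertyName(getter);
            if (property == null) {
                continue;
            }
            try {
                Method setter = target.getClass()
                        .getMethod("set" + property, getter.getReturnType());
                setter.invoke(target, getter.invoke(source));
            } catch (NoSuchMethodException e) {
                // No matching setter: treat the property as read-only and skip it.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Could not copy " + property, e);
            }
        }
    }

    // Returns the property name with its first letter still capitalised,
    // or null if the method is not a getter.
    private static String propertyName(Method method) {
        if (method.getParameterCount() != 0) {
            return null;
        }
        String name = method.getName();
        if (name.startsWith("get") && name.length() > 3 && !name.equals("getClass")) {
            return name.substring(3);
        }
        if (name.startsWith("is") && name.length() > 2
                && method.getReturnType() == boolean.class) {
            return name.substring(2);
        }
        return null;
    }
}

// Hypothetical entity with one "get" property and one "is" property.
class ChatUser {
    private String name;
    private boolean online;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public boolean isOnline() { return online; }
    public void setOnline(boolean online) { this.online = online; }
}
```

In use, the object received from the client would be the source and the entity just loaded from the datastore would be the target.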
A while ago I wrote a post, "Using an ORM or plain SQL?":
This came up last year in a GWT
application I was writing. Lots of
translation from EclipseLink to
presentation objects in the service
implementation. If we were using
ibatis it would've been far simpler to
create the appropriate objects with
ibatis and then pass them all the way
up and down the stack. Some purists
might argue this is Bad™. Maybe so (in
theory) but I tell you what: it
would've led to simpler code, a
simpler stack and more productivity.
which basically matches your observation.
But of course that isn't an option with Google App Engine so you're pretty much stuck having a translation layer between client-side objects and your JPA entities.
JPA entities are quite rigid so they're not really appropriate for sending back and forth between the client anyway. Typically you want little bits from several entities when doing this (thus ending up with some sort of presentation-layer value object). That is your path forward.
Try this. It is a module for serializing GAE core types and sending them to the GWT client.
You can consider using JSON. GWT has the necessary API to parse and generate JSON strings on the client side, and there are plenty of JSON APIs for the server side. I tried google-gson, which works fine. It converts your JSON string to a POJO model and vice versa. Hope this helps you find a decent solution for your requirement.
Currently, I use the DTO (Data Transfer Object) pattern. It's not necessarily as clean and brings plenty more boilerplate, but GAE still requires a fair amount of boilerplate at present. ;)
I have a Domain Object mapped (usually) one-to-one with a DTO. When a client needs Domain info, a DAO(DataAccessObject) coughs up a DTO representation of the Domain object and sends that across the wire. When a DTO comes back, I hand the DAO the DTO which then updates all the appropriate Domain Objects.
Not as clean as being able to pass Domain Objects directly across the wire obviously but the limitations of GAE's JDO implementation and GWT's Serialization process means this is the cleanest way for me to handle this currently.
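A minimal sketch of the Domain Object / DTO / DAO round trip described in this answer might look like the following. All names (Account, AccountDto, AccountDao) are hypothetical, the HashMap is only a stand-in for the datastore, and the persistence annotations a real entity would carry are left out.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain object; in a real app this would be the persistent entity.
final class Account {
    private final long id;
    private String displayName;
    private final String passwordHash; // server-only, never copied into the DTO

    Account(long id, String displayName, String passwordHash) {
        this.id = id;
        this.displayName = displayName;
        this.passwordHash = passwordHash;
    }

    long getId() { return id; }
    String getDisplayName() { return displayName; }
    void setDisplayName(String displayName) { this.displayName = displayName; }
    String getPasswordHash() { return passwordHash; }
}

// Plain serializable object that travels over the wire to the client.
class AccountDto implements Serializable {
    private static final long serialVersionUID = 1L;
    long id;
    String displayName;

    AccountDto() {} // GWT RPC expects a no-argument constructor
}

// Hypothetical DAO: hands out DTOs and applies returned DTOs to domain objects.
final class AccountDao {
    private final Map<Long, Account> store = new HashMap<>(); // stand-in for the datastore

    void save(Account account) {
        store.put(account.getId(), account);
    }

    // Builds the DTO representation of a domain object, or null if it does not exist.
    AccountDto load(long id) {
        Account account = store.get(id);
        if (account == null) {
            return null;
        }
        AccountDto dto = new AccountDto();
        dto.id = account.getId();
        dto.displayName = account.getDisplayName();
        return dto;
    }

    // Copies the editable fields of a returned DTO back onto the domain object.
    boolean apply(AccountDto dto) {
        Account account = store.get(dto.id);
        if (account == null) {
            return false;
        }
        account.setDisplayName(dto.displayName);
        return true;
    }
}
```

One side benefit of the extra boilerplate is that the DTO decides exactly which fields reach the client; here the password hash never leaves the server.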
I believe Google's official answer for this is GWT 2.1 RequestFactory.
Given that you are using GWT and GAE, I'd suggest you stick to the official Google framework... I have a similar GWT / GAE based app and that's what I am doing.
By the way, setting up RequestFactory is a bit of a pain in the ass. The current Eclipse plug-in doesn't include all the jars, but I was able to find the help I needed on Stack Overflow.
I've been using Objectify as well, and I really like it. You still have to do some dancing around with pre/postLoad methods to translate e.g. Text to String and back.
Since GWT ultimately compiles to JavaScript, for detached persistence it would need one of a few services to be available. The best known are HTML5 stores and Gears (both use SQLite!). Of course, neither is widely deployed, so you'd have to convince your users to either use a modern browser or install a little-known plugin. Be sure to degrade to a usable subset if the user doesn't comply.
What about directly using Datastore API to load/store POJO domain objects?
It should be comparable to DTO approach, meaning e.g. that you have to manually handle all fields (if you don't use tricks like reflection-based automation) while it should give you more flexibility and full access to all Datastore features.

Resources