Final GAE vs AWS architectural decision - google-app-engine

I know this has been asked one way or another before, but most of the main issues to do with GAE stability seem to have been asked around the end of 2008, early 2009, or aren't directly related to games at scale (which I'm interested in).
Basically, I have been arguing back and forth with my business partner about whether to use GAE or AWS for the back-end of our social game engine, and now it's crunch time. I love GAE (Java) for so many reasons, and although it used to be unstable, it's pretty good now. The main argument in favour of AWS is the fact that AWS has proven itself with multiple games running tens of millions of active users per day. The obvious pin-up child for AWS is Zynga, with its Farmville peaking at 80+million DAU. And that's just one of the hugely successful games running on the AWS infrastructure. Remarkable achievement.
So, one way or another it's KNOWN to work. GAE on the other hand doesn't have any examples that I could find doing these sorts of numbers. Not even close. So can I trust it? Is there a single example of a large social game with 2 million+ Daily Active Users, using GAE?
The main considerations for our social game back-end are:
Reliable CDN (Amazon CloudFront/S3 is excellent for this, as is Google's obviously excellent DataStore).
Ability to scale without falling over (AWS-EC2 is proven here, GAE doesn't seem to have examples of large game apps which can run into the 1000s of requests per second. GAE used to be quite unstable in this regard and so is my main concern).
Reliable no-SQL database. (AWS-SimpleDB and Google's DataStore are both excellent for this. We really don't need SQL).
Support/someone to call/contact if there is a problem. (This is one of the biggest worries with GAE. I have no idea who I can call, or if it's even possible. AWS has an SLA and support.)
I look forward to your thoughts, but please also note, this is not intended to start any sort of flame war. I love both systems, but both have their positives and negatives, but I'm about to make an architectural decision that likely won't be undone moving forward.
Regards,
Shane

I've never worked with AWS-EC2 so I'm going to share my knowledge just on the Google App Engine side.
Google App Engine is not meant to be a CDN; though it can serve static content through its powerful infrastructure providing caching close to the users, it does not guarantee the same kind of high quality and high availability service of a real CDN because it's not part of its duties.
Further data:
Maximum size of a file using the BlobStore service: 2 Gigabytes
Maximum size of a static file: 10 Megabytes
Currently App Engine always returns 200 status for static files even on Conditional gets (you have to rely on third party caching library like cirruxcache for example).
Recently Google App Engine team has shut down the App Gallery for one simple reason: too many Toy Apps!
Google wants to counteract this tendency showing successful businesses case studies; here are some of them:
BuddyPoke (viral Facebook app with 65 million installs)
WalkScore (serves 3 million request a day to thousands of real estate partner sites)
Webfillings
Snapabug
Optimizely
Ubisoft Facebook TikTok game
Other interesting case studies here
"We are well aware of downtimes and reliability issues, and are working hard to solve them: Improving App Engine reliability is our number one priority" was recently said by a Google Developer Relations Manager here.
App Engine is still in beta and is an evolving platform so you have to be prepared to deal with downtimes and issues.
Google App Engine team has just launched a preview of App Engine for Business providing 99.9% uptime service level agreement and premium developer support available.
Here is my opinion for what it's worth:
I'm aware that it's a tough call; having read a lot of articles about GAE I have mixed feelings about it because you can go from the recent catastrophic Carlos Ble report to the happy experience of Flower Garden or Gri.pe.
App Engine for Business looks promising and I would consider it in the case of a serious business project plan.
The fresh SDK 1.4.0 is huge and it clearly shows that the Team is really pushing hard to fix some annoying issues (Warmup requests) and relaxing some limitations (10 minutes process on TaskQueue).
Last thing to consider: if you are going to have big numbers, the Google App Engine Team will probably take your app as a successfull case study to follow with a boost of free and powerful Hype.

BuddyPoke is one example of a large-scale social app running on GAE. How large I'm not sure. This article says 30m daily page views (not users):
http://googleappengine.blogspot.com/2008/10/app-engine-case-studies.html
Their facebook page says 2.7 million monthly (not daily) users:
http://www.facebook.com/buddypoke
Although, they are also on a heap of other social networks:
http://www.buddypoke.com/
Personally I decided to go with GAE, for a couple of main reasons:
The unit of scalability is a single request, not a whole instance like it is with AWS.
I can work at a higher level, without having to worry about configuring instances.
If your point 4 is a big one for you, then you may be better off with AWS. With GAE there appears to be nothing you can do, and no-one you can contact.
About a week ago I had an issue with my app - it had suddenly started failing in Google's code, in a location which had been working fine for the last 5 days, ie since I had last uploaded my app. The only way to report issues to Google seems to be via their production issue template, here:
http://code.google.com/p/googleappengine/issues/entry?template=Production%20issue
I reported the issue, and didn't hear anything. Since it's running on Google's servers I was unable to resort to any 'usual' emergency tactics like restarting a server. An hour later and the problem resolved itself - I'm not sure if someone at Google saw my message and fixed something, or if it just went away. I updated my bug report to say the problem was fixed, but even now a week later the issue hasn't been closed or even acknowledged. Also since the issue has to be posted publicly, my app is now getting random hits from bots.
Admittedly my app is currently only in beta and so only has a hundred or so users, and so it wasn't a major incident for me. If I was getting thousands / millions of hits, maybe either Google would have noticed the problem themselves earlier, or they would have paid more attention to my bug report.
On your point 3, even my small app with a small amount of traffic throws occasional data store errors (even during times which aren't reported on the availability charts as outages).
Having said this, I still like GAE (I am using the Python version), and plan to stick with it. The promise of GAE is its scalability - although it falls over occasionally now for my small traffic, it shouldn't fall over any more when it scales to much more traffic (ie your point 2), provided I've coded it correctly to avoid contention. I'll see how it goes.
Finally regarding your point 1, the blobstore and/or static files are more like a CDN on GAE, than the datastore. However for very large amounts of traffic, a real CDN may be cheaper. It's also not necessarily a CDN, see Google app engine & CDN.

Related

Should I use GAE + Lift for my Scala based webapp?

Such questions have been asked before - but all of the answers are outdated now.
I am looking forward to work on a Scala based webapp. I understand this question can be split into two, but I am posting them as one because they rely on same context, there being a dependency on the hosting platform and frameworks used.
I have read multiple (awesome) debates on Play! and Lift, but cannot find a good comparison between Play! 2.1 and Lift. How do I decide which one is better for my scenario (a social network website) ?
Similarly, this discussion has some very good arguments as to which platform to use for if I go with Lift, but it's from 2010 and seems outdated. The recommended provider (stax.net) is dead (or I guess it's merged with cloudbees.com). I am personally inclined towards GAE, as they are quick to start with, but unsure if the issues still prevail :
Support for actors (I am not sure if Akka helps us solve this problem)
Requests for a given session being served by different JVMs without notice to running app
Quoting David Pollak (lead author of Lift) :
GAE is slow and non-scalable, despite Google's claims (everyone I've
spoken with that have tried to scale GAE apps have failed and gone
elsewhere). GAE locks you into a tremendously suboptimal storage
mechanism. GAE is free, but so is Stax and there are many inexpensive
options including SliceHost. Next up, you've got Amazon EC2 and
RackSpace. So, I haven't found a good reason for anyone to use GAE.
And if there's no good reason to use GAE, devoting a pile of resources
to code around the GAE JVM incompatibilities (e.g., no new threads)
seems like a waste.
Another issue if I go with GAE is lack of Play! 2.1 support. I still don't see a module for that. Another issue is difficulty to migrate to other databases (although I hear migrating to MongoDB should be relatively easier) in the future. Worst case would be to move out of GAE and use AppScale.
Personally I use Lift, Cloudbees, and MongoLab as my first choice for most of my projects. I tried several cloud hosting services to no avail (Heroku and RedHat in particular. I don't think I tried GAE due to the post from David Pollak that you have already referenced). To use cloudbees, you just need an sbt plugin. Then it is as easy as running the cloudbees-deploy target. Within a minute, your code is up and running. I was floored by how easy it was. I went with Mongo primarily because of this excellent g8 template (note, there is now an SQL equivalent)
Another thing I really like about Cloudbees and MongoLab is they both have free services. It's great for me because I only work on these projects in my free time, so I don't want to spend any money while my ideas are half-baked.
As for Lift, I can't compare it much to Play. I downloaded/installed play and was immediately turned off by how MVC it is. I felt that the view-first approach, albeit foreign to me, seemed to be a much more intuitive and powerful way to build web applications. I love how Lift doesn't obscure from me the fact that I am indeed developing a web application. I often feel that MVC frameworks try to keep all of the HTML/CSS/JS etc at an arms-length.
The question is quite open so I will share my experience and opinion regarding Scala web app development as it might help you with your decision.
I built my first scala web application using Scalatra and Scalate using Jetty as the server. The app is hosted on an Amazon EC2 instance and I've had no problems with this... it's been running since the end of 2011 with only one small blip that took 10 mins to resolve. I found it a good experience for learning to use Scala in web applications.
http://www.scalatra.org/
Typesafe (http://typesafe.com) appear to have opted for the Play Framework and so for my next scala based web app I am likely to go for Play. A book I have been reading on the Play Framework is "Play for Scala". It has just been published this month (Oct 2013).
http://www.manning.com/hilton/
My impression is that Lift was the go-to framework in the past but that this has shifted to the Play Framework.

Google apps engine vs virtual/dedicated server

Hi I roll out some site on google apps engine but I'm not really happy about it for this reason:
django is not fully supported so no administration etc.
using django patch etc. sometimes give me some problem when new release on gae is out
a registering domain (with go daddy) and add it to google apps engine (no redirection)
seems take too times for display page
waste times to avoid some restrictions eg no sql, query number limited etc.
little scare because I've no great traffic but is rising up so I can predict how much
really cost
so my question is for your experience is better a slicehost style server (some dedicated server cost $50 and give you 1,5TB ) or other or google apps engine is economic and pro.
just for getting idea my site is yappiedo.com and sometimes seems slow
other Idea and suggerstions are welcome
thanks
In grab bag format:
As long as you have a model for making money off your visitors, you do not need to be scared about rising costs. See this blog for what happens when this guy was hit with major traffic. The costs are reasonable (for him at least) and any other non-GAE solution would have died a screaming death, costing him many thousands of dollars in lost revenue.
You are not working on a traditional relational system. You need to design your application to work with GAE's strengths, not to fight GAE's weaknesses. If you spend all your time working around datastore limitations, take your data model back to the drawing board and make one that plays well with GAE. De-normalization is usually the key word.
There are significant slowdowns associated with starting an instance of your application the first time. These go away when your app is hit often enough that it stays in memory
The should be no difference between hitting *.appspot.com and a properly setup google apps domain. Check your DNS configuration.
And the answer is: depends.

Google App Engine - When to use it, when not too?

It's still unclear to me when I should or should not use Google App Engine to deploy a commercial web application.
It appears Google has "business" level support.
http://code.google.com/appengine/
Can someone bullet list when I should use Google App Engine and when I shouldn't use it for a web application
The question is surprisingly simple to answer after I've had a stab at google engine with my project for a few weeks. You should use it when:
you can't be arsed to set up a server
you want instant for-free nearly infinite scalability
your traffic is spikey and rather unpredictable
you don't feel like taking care of your own server monitoring tools
you need pricing that fits your actual usage and isn't time-slot based
you are able to chunk long tasks into 30 second pieces
you have the skill/will/desire to work with noSQL and deal with the consequences thereof
you are able to work without direct filesystem access
So actually, you can use it pretty much for anything, especially websites. The only thing it very quickly becomes too pricey for is running large background processes. If you're doing some hardcore number crunching 24/7 you're better off using your own server somewhere because no cloud service can really live up to that.
But think of it this way, where else are you going to get an architecture that can swallow 10+ requests per second load for ten dollars a month?
Basically it boils down to this: If you want to focus on developing your code, not your server architecture. GAE is for you. (unlike amazon which behaves more like a fancy VPS)
I can't really tell you whether you should use App Engine without knowing anything about what you need your web app to do, but I will tell you what App Engine can and can't (or won't) do.
App Engine is fantastically good at scaling. It is, in fact, designed to scale web apps to ridiculous lengths first and foremost, with ease of use and number of features being secondary goals.
That's not to say that App Engine doesn't have features, or isn't easy to use, just that if there ever becomes a choice between adding a feature and staying scalable, the App Engine team will choose scalability.
For example, App Engine doesn't have some of the features of a relational database, because those features don't scale to the size of an app that App Engine is designed to support. App Engine doesn't support requests taking more than 30 seconds, because App Engine is designed to serve a web app, not process long-running requests.
In general, when App Engine doesn't support something, it's not because it's impossible -- nothing is impossible -- but rather because it would detract from the scalability of App Engine.
There are workarounds that can be (and have been) implemented to get around this, particularly with things like the task queue, and App Engine is constantly getting new features and new frameworks built on top of it.
App Engine for Business adds SLAs and different pricing, but is otherwise pretty much the same App Engine.

Pros & Cons of Google App Engine [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
[An Updated List 21st Aug 09]
Help me Compile a List of all the Advantages & Disadvantages of Building an Application on the Google App Engine
Pros:
No need to buy servers or server space (no maintenance).
Makes solving the problem of scaling easier.
Free up to a certain level of consumed resources.
Cons:
Locked into Google App Engine ?
Developers have read-only access to the filesystem on App Engine.
App Engine can only execute code called from an HTTP request (except for scheduled background tasks).
Users may upload arbitrary Python modules, but only if they are pure-Python; C and Pyrex modules are not supported.
App Engine limits the maximum rows returned from an entity get to 1000 rows per Datastore call. (Update - App Engine now supports cursors for accessing larger queries)
Java applications may only use a subset (The JRE Class White List) of the classes from the JRE standard edition.
Java applications cannot create new threads.
Known Issues!! : http://code.google.com/p/googleappengine/issues/list
Hard limits
Apps per developer - 10
Time per request - 30 sec
Files per app - 3,000
HTTP response size - 10 MB
Datastore item size - 1 MB
Application code size - 150 MB
Update Blob store now allows storage of files up to 50MB
Pro or Con?
App Engine's infrastructure removes many of the system administration and development challenges of building applications to scale to millions of hits. Google handles deploying code to a cluster, monitoring, failover, and launching application instances as necessary.
While other services let users install and configure nearly any *NIX compatible software, App Engine requires developers to use Python or Java as the programming language and a limited set of APIs. Current APIs allow storing and retrieving data from a BigTable non-relational database; making HTTP requests; sending e-mail; manipulating images; and caching. Most existing Web applications can't run on App Engine without modification, because they require a relational database.
Pros:
Scalable
Easy and cheaper (in short term).
Nice option for start-ups/individuals.
Suitable for apps that just store and retrieve data.
Cons:
Not suitable for CPU intensive calculations. They are slower and expensive.
Scalability doesn't matter much cuz if an app works at Google scale then probably it makes enough money to run on its own servers.
They have lots of limitations thrown here and there, as a result deep data analysis is difficult. Like you cannot produce a social graph using GAE.
I would say its not meant for serious businesses and expensive in long run.
(A huge new) PRO: GAE now supports MySQL :
https://developers.google.com/cloud-sql/
Pros:
built-in ui for unified logs
built-in web interface for task queues
built-in indexes on list of primary objects.
Cons:
loose logs very fast
VERY expensive
VERY expensive
VERY expensive
Un-hackable. Scales because you're obligated to code in a way that scales.
Longer development cycles. Sometimes you just want to hack something together and throw it away after 5 hors. With appengine you have to proper code it and write a lot of stuff to make it sure it scales. You can't just do a "find . | grep .avi | xargs ffmpeg -compress ...." :)
You will loose hours trying to do the simplest tasks like sending push notifications to APNS (iPhone). Although it's fine if you only want to support android in the future.
Terrible to make cleanups on the database. It's a HUGE pain in the ass to fix rows in the database, mainly because terribly slow, but it also requires a lot of code to loop properly within it's time constraints.
It was a pain to port Lucene to work on it's "filesystem".
Slow for what you pay.
Even MORE expensive if your app has spikes of traffic. My app has those spikes if a user that has many followers makes an action and we have to push notifications to his followers. Because of that I have to keep 10 inactive servers always on ($$$$$) to handle spikes.
Appengine isn't too bad due to the fact that I have the option to burn $$$$ instead of being concerned about scalability and fixing bottlenecks to reduce server usage. Sometimes it worth it.
My advice to people starting new products is to go with hetzner.de which is where I host my other products servers. It's cheap and extremely hackable. I have one server at hetzner that is handling 3x more traffic than the product that I have on appengine. The difference in price is $100 a month versions $2700 a month!
I have system admin experience, so the bottom line is that I would never choose appengine over having my own ROOT server. Don't be that bored software engineer wanting to experiment new things instead of building great products!
Pro: Unlimited scalabity to your application and scales with demand.
Con: Not available in some countries (Argentina).
Edit
Available worldwide, but only through Google Groups for App Engine.
When assessing pros and cons, I think it is important to clarify the market for which one is representing. Developers looking for a cost-effective solution to help them with the steep part of their planned hockey-stick growth curve will weigh heavily the cons already listed. For a small business owner, however, GAE is a God-send. These folks most often are looking to "the cloud" as a means to more effectively run their business (i.e. sell physical product and services). For the SMB, GAE the pros already listed can be much more valuable compared to the hockey-stick seeking dev, whilst the cons weight in at a fraction of the devs' measure. I don't see the GAE team doing anything related to SMB positioning, so I guess answers like this are me just pulling on Superman's cape, and spitting into the wind. Really GAE should be absolutely ruling the SMB space now. If not (I have no insights re: user base), then its is a greatly lamentable failure.
I believe , GAE is yet to mature in terms of providing the basic features for serious business such as Datastore with complex primary key, java.awt.* support, these are just a few I'm naming.
Other than the free space and to build some "Hobby" websites, I strongly feel GAE is NOT the place java guys should looking into.
I'm having applications built on the JSP/Servlets and MySQL, thinking about migrating to GAE, but I find I will be spending more "value time" on the migration than just buying a space from some java hosting provider such as EATJ, etc (Sorry not marketing, just an experience).
Another big issue I've got is migration of my existing mySQL data into GAE, bulkupload is really pathetic and has no client support.
No support for Local Db to Server DB upload.
Once the GAE is ready with "all the Cons" mentioned by above, then I'll think we can look in to this migration.
You are force to own a cell phone line, and your country+carrier must be able to receive international SMSs.
(I hate cell phones, and my mom's or co-workers won't get the SMSs)
Con: No Other RDBMS or NoSQL databases are not possible ....
Con: All your base are belong to us
... On a serious note:
Con: You don't control the environment your application runs in. The same cons as with outsourcing any component. Fun for toys, not for business (yet) IMHO.
Various things like API for Google proprietary backends such as their database system and other 'lockdowns' and frameworks that mean your code is tied, in some loose sense to their system can create cost issues later if you want to migrate from GAE. Of course, you could abstract these.
I like GAE, AppJet and others. They are cool. But everything has its place. If you want freedom and the ability to control your language's modules, API, syntax/stdlib versions and whatnot ... don't relinquish control to a service provider.
The lack of standards for environments and specifications for what your app can expect worries me in the cloud arena.
common sense stuff really.
Con: Limited to Java and Python

Is Google App Engine good for scalablity and portability?

I'm evaluating hosted production environments and currently have interest in Google App Engine.
Currently I'm enjoying the free quotas. I'm concerned if it is efficient to scale up using
Google App Engine. Portability is being analyzed as well.
Please advise if Google App Engine is good for scalability and portability.
Thank you in advance.
Portability is guaranteed by the fact that Google has open-sourced all the parts of App Engine that live "in front of" the RPC layer, thus facilitating the work (which would happen anyway of course!-) of third party like appcelerator and bdbdatastore that implement compatible environment running on different infrastructure -- you only need to stay on Google's systems if Google gives you better ROI for your apps, else can easily migrate them to alternative implementations (I'm sure many more third-parties will join the ranks of these two, offering a variety of such alternatives).
Scalability, when the apps are programmed appropriately, seems proven eg. by the Obama's Town Hall Meeting example -- the app, using an open-sourced Google codebase known as "Moderator", was handling 700 QPS for a total of many millions of visits in a few hours, and maintaining excellent latency and impeccable uptime.
A LOT has been written (and recorded on video) about the right techniques needed to obtain such seamless scalability with App Engine -- there's really no way to summarize all of the hits in this google search! Suffice it to say, it's not trivial, but in the end it's easier (for suitable kinds of apps, at least -- ones that are "front-end heavy" as opposed to ones focused on huge "batch" jobs) than with any other technology I know of.

Resources