I have an application on GAE which does not use any database features, nor do I plan to do so in the future. In Application Settings it says Master/Slave Replication Deprecated!.
My question: Should I still migrate my application to High Replication Datastore for some reason or doesn’t it make any difference?
One thought: What if the database server connected to my application is down? Will my application be stopped too, if it’s not HRD? There might be other issues/considerations, too.
Another thought: It sounds like the SLA is only available for HRD-applications.
Yes, you should definitely upgrade anyway.
Not only will you get the SLA that's only for HRD, you will also have your app served from the newer datacenters. Plus, even if you aren't directly using the datastore, if you are using sessions or task queues, those services use datastore internally to store state for you.
Also, I noticed an improvement on memcache performance after upgrading... So, yes. Do the upgrade, the M/S datastore is deprecated because the HRD infrastructure is built on newer, better technology.
Related
I would need to automate the creation of new App Engine projects. Is this possible? I see there is a Google Cloud SQL Admin API which can create new Cloud SQL instances, but what about App Engine? Is there anything similar?
Update:
We have developed an application that runs on GAE and uses Cloud SQL and plenty of API integration with most of Google Apps. We foresee dozens, if not hundreds, of customers in a near future. All of them will be using their own Google domain and Google Apps.
While we could actually just deploy the application in our App Engine and modify the Cloud SQL tables to include the id of the customer who owns the record, we thought it would be better if we deploy an app instance and Cloud SQL for every one of them (on our own account). The main reasons coming to mind are that we can track how much every customer spends in terms of billing, and speed up the database since Cloud SQL is just a MySQL instance.
Steps for the creation would require editing a properties file in the packaged .war file, adding the certificate used to log in as a service account, and probably something that I am missing at this moment :-P
This question is somehow related Create an App Engine Project ID command line
As far as I know this is not possible (and is unlikely to be possible anytime soon).
Update:
I can see why splitting into separate projects for billing purposes would be really nice (multi-tenancy is great, but getting one bill per customer from Google sounds easier), but unfortunately I don't think that it's going to be your best option.
For AppEngine, you may want to look into the multi-tenancy features (or in Python) and how to get stats for billing.
Keep in mind however, CloudSQL is not simply a MySQL instance. It happens to speak MySQL but is not the same as running MySQL on Compute Engine for example. I would recommend that you run some benchmarks to be sure that the "adding the customer ID to the table" idea you had won't work.
Lastly, a possibly relevant read: http://signalvnoise.com/posts/1509-mr-moore-gets-to-punt-on-sharding
I guess the conclusion is that there’s no use in preempting the technological progress of tomorrow. Machines will get faster and cheaper all the time, but you’ll still only have the same limited programming resources that you had yesterday.
If you can spend them on adding stuff that users care about instead of prematurely optimizing for the future, you stand a better chance of being in business when that tomorrow finally rolls around.
I am trying to decide which add-on DB to use with my application when I deploy it on AppHarbor. I've two choices: JustOneDB or Cloudant. I am planning to develop a web and mobile application, which will should work with Terabytes of data.
I am searching for the easiest solution to deploy my database, without me needing to partition the DB and the tables. I want a DB that can handle a very large amount of data, but takes the sharding and partitioning architecture building away from the developer.
I also want a solution that will allow me to easily backup my large database and easily restore it.
From what I've read, Cloudant and JustOneDB are the two most popular ones, and those are available as add-ons on AppHarbor for easy deployment.
I need your recommendations on which one I should go with, the cons and pros of each one. I am developing my application in ASP.NET and C# inside Visual Studio.
There's a recent post on the Cloudant blog about using the MyCouch .Net library with Cloudant databases:
https://cloudant.com/blog/how-to-customize-quorum-with-cloudant-using-mycouch/
Cloudant also offers free hosting up to a greater than $5 bill and can work with Apache CouchDB's replication if you want to develop locally and sync it to the cloud for production/deployment. Multi-master replication isn't something many other databases offer.
Best of luck with your application!
MyCouch.Cloudant was just released. Except from CouchDb and Cloudant core feature support the MyCouch.Cloudant NuGet package adds support for Searches. There will be more Cloudant specific features added to this. It's written in C# and supports .Net40, .Net45 and Windows store apps.
You will find more info about MyCouch in the GitHub repo.
You should probably also consider MongoDB and RavenDB.
If you're just starting out, your first concern should probably be to find a database that'll let you quickly get started and build the application you have in mind. When the application becomes a success and actually attracts terabytes of data, you can start worrying about how to scale it. If the application is soundly architected, adapting it to use an appropriate datastore should not be a monumental task.
Comment removed by originator.
Note: (I have investigated CouchDB for sometime and need some actual experiences).
I have an Oracle database for a fleet tracking service and some status here are:
100 GB db
Huge insertion/sec (our received messages)
Reliable replication (via Oracle streams on 4 servers)
Heavy complex queries.
Now the question: Can CouchDB be used in this case?
Note: Why I thought of CouchDB?
I have read about it's ability to scale horizontally very well. That's very important in our case.
Since it's schema free we can handle changes more properly since we have a lot of changes in different tables and stored procedures.
Thanks
Edit I:
I need transactions too. But I can tolerate other solutions too. And If there is a little delay in replication, that would be no problem IF it is guaranteed.
You are enjoying the following features with your database:
Using it in production
The data is naturally relational (related to itself)
Huge insertion rate (no MVCC concerns)
Complex queries
Transactions
These are all reasons not to switch to CouchDB.
Of course, the story is not so simple. I think you have discovered what many people never learn: complex problems require complex solutions. We cannot simply replace our database and take the rest of the month off. Sure, CouchDB (and BigCouch) supports excellent horizontal scaling (and cross-datacenter replication too!) but the cost will be rewriting a production application. That is not right.
So, where can CouchDB benefit you?
I suggest that you begin augmenting your application with CouchDB applications. Deploy CouchDB, import your data into it, and build non mission-critical applications. See where it fits best.
For your project, these are the key CouchDB strengths:
It is a small, simple tool—easy for you to set up on a workstation or server
It is a web server. It integrates very well with your infrastructure and security policies.
For example, if you have a flexible policy, just set it up on your LAN
If you have a strict network and firewall policy, you can set it up behind a VPN, or with your SSL certificates
With that step done, it is very easy to access now. Just make http or http requests. Whether you are importing data from Oracle with a custom tool, or using your web browser, it's all the same.
Yes! CouchDB is an app server too! It has a built-in administrative app, to explore data, change the config, etc. (like a built-in phpmyadmin). But for you, the value will be building admin applications and reports as simple, traditional HTML/Javascript/CSS applications. You can get as fancy or as simple as you like.
As your project grows and becomes valuable, you are in a great position to grow, using replication
Either expand the core with larger CouchDB clusters
Or, replicate your data and applications into different data centers, or onto individual workstations, or mobile phones, etc. (The strategy will be more obvious when the time comes.)
CouchDB gives you a simple web server and web site. It gives you a built-in web services API to your data. It makes it easy to build web apps. Therefore, CouchDB seems ideal for extending your core application, not replacing it.
I don't agree with this answer..
I think CouchDB suits especially well fleet tracking use case, due to their distributed nature. Moreover, the unreliable nature of gprs connections used for transmitting position data, makes the offline-first paradygm of couchapps the perfect partner for your application.
For uploading data from truck, Insertion-rate can take a huge advantage from couchdb replication and bulk inserts, especially if performed on ssd-based couchdb hosting.
For downloading data to truck, couchdb provides filtered replication, allowing each truck to download only the data it really needs, instead of the whole database.
Regarding complex queries, NoSQL database are more flexible and can perform much faster than relation databases.. It's only a matter of structuring and querying your data reasonably.
Is it possible to run Google App Engine Development Server on my own server? How well development server datastore can handle high load and what amount of data will cripple it?
Some options for running an App Engine app without App Engine:
TyphoonAE, which runs Python apps using a stack of popular open-source components
appscale, which runs Python or Java apps off of Amazon's EC2 cloud
I haven't tried either. See this question for some additional discussion of both.
How well will the datastore perform if you simply spin up dev_appserver.py on a public IP? If you have a lot of data, poorly. When using the dev server, the entire datastore is held in memory, so as you insert data, Python's memory usage will climb. Once you've added enough data to cause your system to start swapping, your app will become unusably slow. There's an option in the dev server to use a SQLite datastore stub instead of the in-memory stub. This makes performance tolerable with large amounts of data, but it's not nearly as efficient as the production datastore, so datastore access is relatively slow even with small amounts of data. Certainly much slower than the in-memory datastore with small amounts of data.
Running the dev server as a stand-alone production server is just generally a bad idea. The API stubs provided with the dev server are designed for use by developers, not users. E.g. sending mail just writes a log entry instead of actually sending mail; logging in as an administrator entails clicking a checkbox that says "log in as administrator".
If you want to move an existing app off App Engine, use one of the options above. If you're developing an app from scratch, use Django or some other framework that's designed to run on generic hardware. The development server is intended for just that: development.
YES, with a lot of missing features (parallel queues, cron jobs, mail, XMPP,..), some hidden security issues, poor performance and stability, it is technically possible.
As you probably guessed, it's a bad idea.
Take for example the HTTP server; using the development server you would put in Production an undocumented BaseHTTPServer, quite impossible to configure and with probably some hidden security flaws ready to be exploited.
As #Drew well said, there are better choices out there to run you Google App Engine code in a Production ready environment that is not GAE.
Although this is 2y+ old thread, just adding my info: http://www.jboss.org/capedwarf
I recently had to realize a project on Google AppEngine. At the beginning I was sceptic. But there a some really nice approaches on Appengine:
No server setup. Everything works out of the box. Gzip, libraries, etc.
One-Click-Deployment. Fire up GAE Launcher on the Mac and click DEPLOY. Done.
Low costs
Easy in-production-logging
But there are some things I don't like if I'm thinking about professional projects
The blobstore. It's just... weird. And non-backupable
All the 1 MB restrictions
The feeling that your code will only run on AppEngine. (BigTable)
Do you know any similar alternatives to AppEngine? And I don't mean services like EC2.
You can have a look at AppScale
Its an open-source implementation of AppEngine which you can deploy on your own machines, with a host of databases to choose from.
I think Heroku is a great alternative.
It can run most of GAE existing apps, since it supports django, but also:
It supports Ruby (w or w/o Rails), Java (w or w/o Spring), Node.js, Clojure,...
It has a strong CLI support (git push for publishing, creation of apps, scaling, log, ps,...)
It supports MySql and PostgreSQL (and, so on, MongoDB, Amazon RDS, etc.)
It has a free tier for 750 hours a month (~1 machine always up) for every app.
It has a collection of addons for providing cloud services as a resources for the apps
It has an add-on program to develop your own add-ons.
Really, it is a good alternative.
If you want your application is not binded to GAE, the best approach is to use well-known langs and well-known persistence providers. Ruby+PostgreSQL, for instance, could be a combination very portable. Django as well, but w/o BigTable...
AppScale and TyphoonAE are both third-party implementations of the App Engine platform. TyphoonAE is targeted at small-to-medium scale, while AppScale is targeted at the large-scale end of things.
As far as backing up the blobstore goes, this is quite doable: just use the built in handler to serve blobs, and, in conjunction with remote_api, you can download your blobs just fine.
I almost hate to mention Microsoft in a Google-related question, but I'm completely vendor-agnostic. So, I'll offer Microsoft's Azure as a platform that offers many similarities to AppEngine but enough differences that it might fit as a good answer to your question.
Azure and AppEngine are similar in that they are both designed to allow you to build easily scalable applications. Azure gives you Microsoft's standard web toolkit options: C#, VB.NET, ASP.NET ASP.NET MVC, but also offers PHP. It has a NoSQL, document database like AppEngine but also gives you the option of choosing a more standard instance of SQL Server. Although I haven't used it myself, it looks like AppEngine for Business now offers SQL, too.
Azure gives you a ready means of having long-running background processes. AppEngine does not to the best of my knowledge.
From my perspective, AppEngine has the huge advantage of charging you for usage only when a request is actually being processed. An Azure instance causes you to get billed even for time that it is completely idle. That's entirely typical, but the fact that Google doesn't so it that way makes me choose AppEngine every time. My budget is too tight to allow me to spend money for idle CPU hours.
There's a port of django to non-relational databases that works with app engine or mongodb.
google for django non-rel
documentation is a bit sparse though